Abstract

We discuss the use of saddlepoint methods in the analysis of port- 
folios, with particular reference to credit portfolios. The objective is 
to proceed from a model of the loss distribution, given through prob- 
abilities, correlations and the like, to an analytical approximation of 
the distribution. Once this is done we show how to derive the so-called 
risk contributions which are the derivatives of risk measures, such as 
a given quantile (VaR) or expected shortfall, to the allocations in the 
underlying assets. These show, informally, where the risk is coming 
from, and also indicate how to go about optimising the portfolio. 

1 Introduction 

Problems in quantitative finance can, in the main, be put into one of two 
compartments: ascribing probabilities to events, and calculating expecta- 
tions. The first class of problems is essentially what modelling is about, 
whether in derivatives pricing or in portfolio or risk management; the sec- 
ond is of course fundamental to their practical application. The distinction 
between the two compartments is worth making. On the one hand there are 
some computational techniques such as the binomial or trinomial tree that 
are applicable to many different models and asset classes (provided the un- 
derlying dynamics are diffusive) . Conversely, there are some models that can 
be calculated using many different techniques, such as for example synthetic 
CDO models which can be tackled by numerical grid techniques, analyti- 
cal approximations or Monte Carlo, or as a second example the emerging 
application of Lévy processes in derivatives theory, which give rise to models
that can be calculated using Fourier transforms, Laplace transforms, or
Monte Carlo. Divorcing the model from its calculation is, therefore, quite
a useful idea; as for one thing, when something goes wrong, as frequently 
happens in the credit world, one needs to know whether it is the model or 
the calculation of it that was at fault. In my experience, it is generally the 
former, as usually what has happened is that the modellers have overlooked 
an important source of risk and implicitly ascribed far too low a probability 
to it. That said, accurate and fast calculation is important, and that is what 
I shall be talking about here. 

The construction of the distribution of losses, or of profit-and-loss, of 
a portfolio of assets is a well-established problem, whether in structuring 
multiasset derivatives, or running part of a trading operation, an investment 
portfolio or a bank. It can take many guises, according to whether one is
concerned with risk on a buy-and-hold view, or whether it is mark-to-market 
that is of more importance. 

In this chapter we shall give the first complete exposition of the saddlepoint
method as applied to the calculation and management of portfolio losses, in
an environment that is quite general and therefore applicable to many as- 
set classes and many models. The methods described here apply equally 
well in the buy-and-hold and the mark-to-market contexts, and have been 
applied very successfully in both arenas. Their most natural application is 
in credit, as through the collateralised debt obligation (CDO) market, and 
investment banks' exposure to bonds and loans, there has been an impetus 
to understand the losses that can be incurred by portfolios of credit assets. 
Portfolios of hedge fund exposures or derivatives of other types could also 
be analysed in the same framework. What makes credit particularly de- 
manding of advanced analytics is that in general the losses that come from 
it are highly asymmetrical, with the long-credit investor generally receiving 
a small premium for bearing credit risk and occasionally suffering a very 
much larger loss when he chooses a 'bad apple'. This asymmetry has also
given rise to the exploration of risk measures other than the conventional
standard deviation, notably the Value at Risk (VaR) and expected shortfall
(also known as conditional-VaR or CVaR). Another offshoot of the presence
of credit portfolios has been the discussion of correlation in its various
guises, and indeed in the form of CDO tranches one can actually trade cor- 
relation as an asset class. This has led to a standardisation of approach, 
with the conditional-independence approach now being almost universally 
used in the specification and implementation of portfolio models (the CDO 
world's brief flirtation with copulas being little more than another way of 
dressing up a fairly basic concept). The principle behind conditional inde- 
pendence is that the assets are independent conditionally on the outcome of 
some random variable often called a risk-factor. By specifying the correlation
this way, one has a clear interpretation of the behaviour of large-scale
portfolios (which has led to the ASRF model of Gordy [13]) and the joint 
distribution of all assets is specified reasonably parsimoniously. 

The technique described here is called the saddlepoint approximation 
and comes from asymptotic analysis and complex variable theory. Although 
common in statistics and mathematical physics, its first application in portfolio
theory seems to have been in 1998 [3], when the present author applied
it to the calculation of distributions arising from simple credit loss models;
since then the method has been applied and improved by various authors.

This chapter naturally divides into two parts. First, we have the con- 
struction of the distribution of losses, which is an essential ingredient of 
valuing CDO tranches and calculating risk measures at different levels of 
confidence. Secondly, we want to understand where the risk is coming from, 
by which more formally we mean the sensitivity of risk to asset allocation, 
an idea that is fundamental to portfolio theory and risk management and 
is at the heart of the Capital Asset Pricing Model (CAPM). The two parts 
are closely connected, as we shall see that in obtaining the loss distribu- 
tion, much information about the risk contributions is already apparent and 
needs only a few simple computations to extract it. In the theory of risk 
contributions, we shall identify some problems with the sensitivity of VaR 
to asset allocation, and show that, somewhat paradoxically, analytical ap- 
proximations to VaR contribution are actually more useful than the 'correct' 
answer. We will also reestablish the fact that shortfall does not have these 
technical difficulties, and give a complete derivation of the sensitivity theory 
in both first and second order. The second-order behaviour will be shown 
to be 'well-behaved' by which we mean that the saddlepoint approxima- 
tion preserves the convexity and therefore is a reliable platform for portfolio 
optimisation (unlike VaR). 

2 Approximation of Loss Distributions 

2.1 Characteristic functions 

Portfolio analytics can, in the main, be understood by reference to the char- 
acteristic function (CF) of the portfolio's distribution. The characteristic 
function of the random variable Y, which from now on will denote the loss 
(or P&L) of the portfolio in question, is defined as the following function of 
the complex variable $\omega$:

\[ C_Y(\omega) = E[e^{i\omega Y}]. \]

The density of Y can be recovered from it by the inverse Fourier integral

\[ f_Y(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} C_Y(\omega)\, e^{-i\omega x}\, d\omega. \tag{1} \]

If the distribution of Y is discrete then the convergence is delicate and
the result has to be interpreted using delta-functions:
$\frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\omega(y-x)}\, d\omega = \delta(y-x)$.

Obviously we have to be able to efficiently compute Cy . The ingredients 
are: (i) the characteristic function of independent random variables is the 
product of their characteristic functions, and (ii) the characteristic function 
is an expectation, so one can do the usual trick of conditioning on a risk- 
factor and then integrating out. This construction is universal in portfolio 
problems, so we can review a few examples now. 



Example: CreditRisk+

Consider first a default/no-default model. Let the jth asset have a loss net
of recovery $a_j$ and a conditional default probability $p_j(V)$, where V is the
risk-factor. Then

\[ C_Y(\omega) = E\Big[ \prod_j \big( 1 - p_j(V) + p_j(V)\, e^{i\omega a_j} \big) \Big]. \]

Make the assumption that the default probabilities are quite small, so that

\[ C_Y(\omega) \approx E\Big[ \exp\Big( \sum_j p_j(V) \big( e^{i\omega a_j} - 1 \big) \Big) \Big] \]

(the Poisson approximation), and then impose the following model of the
conditional default probability: $p_j(V) = p_j \cdot V$, so that the action of the
risk-factor is to scale the default probabilities in proportion to each other.
Of course the risk-factor has to be nonnegative with probability 1. Then

\[ C_Y(\omega) \approx E\Big[ \exp\Big( \sum_j p_j \big( e^{i\omega a_j} - 1 \big) V \Big) \Big]
   = M_V\Big( \sum_j p_j \big( e^{i\omega a_j} - 1 \big) \Big), \]

where $M_V(s) = E[e^{sV}]$ denotes the moment-generating function (MGF) of
V. It is then a matter of choosing a distribution for V (apart from positivity,
it needs to have a MGF that is known in closed form, so the Gamma and
inverse Gaussian distributions are obvious candidates). Incidentally
CreditRisk+ is configured to have exposures that are integer multiples of some
quantum, and this enables the loss distribution to be obtained recursively.²

1 As usual i denotes $\sqrt{-1}$.
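The closed-form CF above is straightforward to evaluate. The following is a minimal sketch, assuming a Gamma-distributed risk-factor with unit mean; the function names, the shape parameter and the three-asset portfolio are illustrative choices, not taken from the text.

```python
import cmath

def gamma_mgf(s, shape, scale):
    """MGF of a Gamma(shape, scale) variable: E[exp(sV)] = (1 - scale*s)**(-shape)."""
    return (1.0 - scale * s) ** (-shape)

def creditrisk_plus_cf(omega, exposures, probs, shape=2.0):
    """Poisson-approximated CF: M_V(sum_j p_j (exp(i*omega*a_j) - 1)).

    Assumption: V ~ Gamma(shape, scale=1/shape), so that E[V] = 1 and p_j
    remains the unconditional default probability of asset j.
    """
    s = sum(p * (cmath.exp(1j * omega * a) - 1.0) for a, p in zip(exposures, probs))
    return gamma_mgf(s, shape, 1.0 / shape)

exposures = [10.0, 5.0, 2.0]   # hypothetical losses net of recovery a_j
probs = [0.01, 0.02, 0.03]     # hypothetical default probabilities p_j

# C_Y'(0) = i E[Y], so a central difference at the origin recovers the expected loss.
h = 1e-5
expected_loss = ((creditrisk_plus_cf(h, exposures, probs)
                  - creditrisk_plus_cf(-h, exposures, probs)) / (2j * h)).real
```

Because E[V] = 1 under this parametrisation, the expected loss recovered from the CF should agree with the sum of $a_j p_j$, which is a convenient sanity check on any implementation.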

Example: Extended CreditRisk+; Gaussian copula 

Sometimes $C_{Y|V}$ is known in closed form but $C_Y$ is not. Here are two
examples.

An objection to the basic CreditRisk+ model as just described is that
assets can default more than once, which for low-grade assets is a problem.
In that case we cannot use the Poisson approximation, and instead have to
calculate the CF numerically. One therefore has

\[ C_Y(\omega) = \int_{-\infty}^{\infty} \prod_j \big( 1 - p_j(v) + p_j(v)\, e^{i\omega a_j} \big) f(v)\, dv \tag{2} \]

with $p_j(v) = \min(p_j v, 1)$ and $f(v)$ denoting the density of V (which is zero
for $v < 0$).

The Gaussian copula model requires the same treatment. In that case (2)
can still be used, but with different ingredients:
$p_j(V) = \Phi\big( (\Phi^{-1}(p_j) - \beta_j V) / \sqrt{1 - \beta_j^2} \big)$
and $V \sim N(0, 1)$ for the risk-factor.

It is apparent that in this framework any model can be handled, once
the distribution of V and the 'coupling function' $p_j(V)$ are specified. In
fact, it is worth noting in passing that specifying a model this way leads to
overspecification, as replacing V by h(V) (with h some invertible function)
and $p_j$ by $p_j \circ h^{-1}$ leaves the model unaffected. One can therefore
standardise all one-factor models to have a Normally-distributed risk-factor, which
simplifies the implementation somewhat.
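The "condition, then integrate out" recipe of (2) can be sketched as follows for the one-factor Gaussian copula. This is an illustration under stated assumptions: the outer integral is done by a plain trapezoidal rule on a truncated range, the sign convention is that a low value of V raises default probabilities, and the function names, grid settings and two-asset portfolio are hypothetical.

```python
import cmath
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def conditional_pd(p, beta, v):
    """Gaussian-copula default probability of one asset, conditional on V = v."""
    c = _STD_NORMAL.inv_cdf(p)
    return _STD_NORMAL.cdf((c - beta * v) / math.sqrt(1.0 - beta * beta))

def unconditional_cf(omega, exposures, probs, betas, n_steps=400, v_max=8.0):
    """C_Y(omega) as in (2): trapezoidal integration over v in [-v_max, v_max]."""
    h = 2.0 * v_max / n_steps
    total = 0.0 + 0.0j
    for k in range(n_steps + 1):
        v = -v_max + k * h
        weight = h * (0.5 if k in (0, n_steps) else 1.0)
        density = math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)
        # Conditional CF: product over conditionally independent assets.
        cond_cf = 1.0 + 0.0j
        for a, p, b in zip(exposures, probs, betas):
            pv = conditional_pd(p, b, v)
            cond_cf *= 1.0 - pv + pv * cmath.exp(1j * omega * a)
        total += weight * density * cond_cf
    return total

exposures = [10.0, 5.0]   # hypothetical a_j
probs = [0.01, 0.05]      # hypothetical unconditional default probabilities
betas = [0.5, 0.3]        # hypothetical factor loadings

h = 1e-4
expected_loss = ((unconditional_cf(h, exposures, probs, betas)
                  - unconditional_cf(-h, exposures, probs, betas)) / (2j * h)).real
```

Since the conditional default probability averages back to $p_j$ over V, the expected loss implied by the CF should again equal the sum of $a_j p_j$.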

2 Which imposes further restrictions on the distribution of V; see [15] for a full discussion
of the Panjer recursion.

Example: Merton model

There are also situations in which even $C_{Y|V}$ is not known in closed form,
though to find examples we have to go beyond the simple default/no-default
model. A detailed exposition of mark-to-market (MTM) models would take
us outside the scope of this chapter, but here is the basic idea. We adopt
the Merton approach, in which the holder of debt is short a put option
on the firm's assets. The put value relates to the firm value through the
Black-Scholes put formula (for a simple model, at least), and the values
of different firms' debt may be correlated by correlating their asset levels.
If the firms' asset returns $Z_j$ are distributed as N(0, 1) after suitable
normalisation, then a simple Gaussian model is the obvious choice:

\[ Z_j = \beta_j V + \sqrt{1 - \beta_j^2}\, U_j, \]

where V is the common part and $U_j$ the idiosyncratic part. Let the value
of the debt be some function $g_j(Z_j)$ say. Then, conditioning on V and
integrating out the idiosyncratic return $U_j$, we have

\[ C_{X_j|V}(\omega) = \int_{-\infty}^{\infty} \exp\Big( i\omega\, g_j\big( \beta_j V + \sqrt{1 - \beta_j^2}\, u \big) \Big) \frac{e^{-u^2/2}}{\sqrt{2\pi}}\, du \]

and the integral has to be done numerically. On top of that, the outer inte- 
gral over V has to be done. This can be viewed as a continuum analogue of 
'credit migration' (the more common discrete version having been discussed 
by Barco [5]). The simple model above with Normal distributions is not 
to be recommended for serious use because the probability of large spread 
movements is too low, but one can build more complex models in which 
jump processes are used and from the point of view of risk aggregation the 
same principles apply. 



2.2 Inversion 

Having arrived at the characteristic function, we now wish to recover the 
density, tail probability or whatever. The density of the random variable Y 
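can be recovered from (1), which can, for conditional independence models,
be written in two equivalent forms:

\[ f_Y(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} E\big[ C_{Y|V}(\omega) \big]\, e^{-i\omega x}\, d\omega \tag{3} \]

\[ f_Y(x) = E\Big[ \frac{1}{2\pi} \int_{-\infty}^{\infty} C_{Y|V}(\omega)\, e^{-i\omega x}\, d\omega \Big] \tag{4} \]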



(i.e. the 'outer integration' over V can be done inside the inversion integral, 
or outside). Other expressions such as the tail probability and expected 
shortfall can be dealt with similarly. 
In simple cases, such as default/no-default with equal loss amounts, the
most direct route is to assemble the loss distribution on a grid (see e.g.
[8]). The distribution of independent losses is found recursively, building
the portfolio up asset by asset, and for conditionally-independent losses one
follows the usual "condition, find distribution, integrate-out" route: this is,
in essence, eq. (4). When the loss amounts are not identical, this procedure
can still be followed as long as one is prepared to bucket the losses so that 
they do end up on a grid. Note that for very large portfolios this method is 
very inefficient because the computation time is proportional to the square of 
the portfolio size. The method is also unsuited to continuous distributions. 
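The grid recursion just described can be sketched as below, assuming integer exposures (already bucketed onto the grid) and a risk-factor that takes finitely many states. The function names and the state/weight representation are illustrative assumptions.

```python
def loss_distribution(exposures, probs):
    """Exact loss distribution of independent default/no-default assets.

    exposures: positive integer losses (multiples of the grid unit)
    probs:     default probabilities
    Returns a list dist with dist[k] = P[loss = k].
    """
    dist = [1.0]                            # empty portfolio: loss 0 with certainty
    for a, p in zip(exposures, probs):
        new = [0.0] * (len(dist) + a)
        for k, mass in enumerate(dist):
            new[k] += (1.0 - p) * mass      # asset survives
            new[k + a] += p * mass          # asset defaults: loss shifts by a
        dist = new
    return dist

def mixed_loss_distribution(exposures, cond_probs, weights):
    """'Condition, find distribution, integrate out' over discrete factor states.

    cond_probs[m] lists the conditional default probabilities in state m,
    weights[m] is the probability of that state.
    """
    total = [0.0] * (sum(exposures) + 1)
    for probs, w in zip(cond_probs, weights):
        for k, mass in enumerate(loss_distribution(exposures, probs)):
            total[k] += w * mass
    return total

two_asset = loss_distribution([1, 1], [0.1, 0.2])
mixed = mixed_loss_distribution([1, 1], [[0.1, 0.2], [0.3, 0.4]], [0.5, 0.5])
```

Each asset costs one pass over the current grid, and the grid itself grows with the portfolio, which is the source of the quadratic computation time noted above.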

For the aggregation of arbitrary distributions the Fast Fourier Transform, 
essentially a numerical integration of the Fourier transform, is a useful tool. 
As the inversion integral is a linear function of the characteristic function, it 
does not matter which of (3), (4) is followed, though the former requires fewer
calls to the FFT routine and is therefore preferable. A few things should be 
borne in mind about FFTs: first, the FFT is a grid method and the use of 
distributions that are not precisely represented on a grid can cause artefacts; 
secondly, although the inversion is fast, the evaluation of the characteristic 
functions at each gridpoint is not always, and this is where most of the 
computation time is spent. 
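To make the grid inversion concrete, here is a minimal sketch that inverts the CF of independent default/no-default losses with a plain discrete Fourier sum. It is written as an O(n²) loop for transparency rather than as an FFT, and it assumes integer exposures so that the distribution sits exactly on the grid; the function names are illustrative.

```python
import cmath
import math

def portfolio_cf(omega, exposures, probs):
    """CF of a sum of independent default/no-default losses."""
    cf = 1.0 + 0.0j
    for a, p in zip(exposures, probs):
        cf *= 1.0 - p + p * cmath.exp(1j * omega * a)
    return cf

def invert_cf_on_grid(exposures, probs):
    """Recover P[Y = k], k = 0..sum(exposures), from CF values on a frequency grid."""
    n = sum(exposures) + 1
    # The CF evaluations are typically the expensive part, so do them once.
    cf_values = [portfolio_cf(2.0 * math.pi * m / n, exposures, probs) for m in range(n)]
    dist = []
    for k in range(n):
        acc = sum(c * cmath.exp(-2j * math.pi * m * k / n) for m, c in enumerate(cf_values))
        dist.append(acc.real / n)
    return dist

dist = invert_cf_on_grid([1, 1, 1], [0.1, 0.1, 0.1])
```

For three unit exposures with default probability 10% the result should reproduce the binomial probabilities; an FFT routine computes the same sums in O(n log n).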

Finally we have analytical approximations e.g. Central Limit Theorem 
(CLT), Edgeworth, Saddlepoint. In their classical form these are large port- 
folio approximations for sums of independent random variables, and they 
have an obvious advantage over numerical techniques on large portfolios: 
numerical methods spend an inordinately long time computing a distribu- 
tion that the CLT would approximate very well knowing only the mean and 
variance (which are trivial to compute). These methods are nonlinear in the
CF and so (3) and (4) give different results.

When the analytical approximation is done inside the expectation, as in
(4), it is said to be indirect. If outside, as in (3), so that the unconditional CF
is being handled, it is direct. It is argued in [21] that the indirect method 
is 'safer', because it is still being applied to independent variables, with 
the outer integration having essentially no bearing on the accuracy of the 
approximation: by contrast, the direct method is more like a 'black box'. 
We concentrate on the indirect method that we shall be using throughout 
this chapter, with only a short section on the direct method. (In any case, 
the formulae for the direct methods will be obvious from the indirect ones.) 
2.3 Saddlepoint approximation: concepts, and approximation to density

We need to introduce some more notation, in the shape of the moment-generating
function (MGF), $M_Y(s) = E[e^{sY}]$, and when doing so we
understand that it is to be evaluated for values of s that might not be purely
imaginary. As $M_Y(s) = C_Y(\omega)$ when $s = i\omega$, and $C_Y(\omega)$ exists for all $\omega \in \mathbb{R}$
regardless of the distribution of Y, it must hold that $M_Y(s)$ exists for all
pure imaginary s. However, when using the notation $M_Y$ we will take it as
read that $M_Y(s)$ exists for values of s off the imaginary axis (more precisely,
in some band $s_- < \mathrm{Re}(s) < s_+$ with $s_- < 0 < s_+$). This imposes the
restriction that the distribution of Y decay at an appropriate rate in the tails
(exponential decay is sufficient). Note that in many examples for credit risk
(portfolios of bonds, CDS, CDOs etc.) the maximum loss is finite and so
this condition is automatically satisfied. It is also satisfied for anything
Normally distributed. The term cumulant-generating function (KGF) is used
for $K_Y(s) = \log M_Y(s)$.

The most well-known results for the approximation of the distribution
of a sum of independent random variables

\[ Y = \sum_{j=1}^{n} X_j, \qquad (X_j) \text{ i.i.d.}, \]

are the Central Limit (CLT) and Edgeworth expansions, but they are found
not to approximate the tail well: they give best accuracy in the middle of
the distribution. So a neat idea is to change measure by 'tilting' so that the
region of interest becomes near the mean. A convenient approach is to use
an exponential multiplier:

\[ \tilde f_Y(y) = \frac{e^{\lambda y} f_Y(y)}{M_Y(\lambda)}, \]

where the denominator is chosen so as to make the 'tilted' distribution
integrate to unity. It is easily verified that under the tilted measure ($\tilde P$) the
mean and variance of Y are $K_Y'(\lambda)$ and $K_Y''(\lambda)$, where $K_Y = \log M_Y$. By
choosing $K_Y'(\lambda) = y$, we have in effect shifted the middle of the distribution
to y. We then argue that if Y is very roughly Normal under P, and
hence also under $\tilde P$, then its probability density at its mean must be about
$1/\sqrt{2\pi\tilde\sigma^2}$ where $\tilde\sigma^2$ is its variance under $\tilde P$. Then

\[ f_Y(y) = e^{K_Y(\lambda) - \lambda y}\, \tilde f_Y(y) \approx \frac{e^{K_Y(\lambda) - \lambda y}}{\sqrt{2\pi K_Y''(\lambda)}}, \]

which is the saddlepoint approximation to the density of Y. The '$1/\sqrt{2\pi\sigma^2}$'
approximation is a lot better than appearances might suggest. For example,
with an exponential distribution (density $e^{-x}$) the true density at the mean
is $e^{-1}$ and the approximation is $1/\sqrt{2\pi}$, which is about 8% too high: this in
spite of the fact that the exponential and Normal distributions have very different
shapes. Note that, when applied to the saddlepoint approximation, this 8%
error is observed uniformly across the distribution (because an exponential
distribution remains exponential under tilting).
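As a worked check of the 8% figure, the tilting recipe can be carried through by hand for the unit exponential distribution. The derivation below uses only the definitions given above.

```latex
% Unit exponential: f_Y(y) = e^{-y}, y > 0.
M_Y(s) = \frac{1}{1-s}, \qquad K_Y(s) = -\ln(1-s), \qquad
K_Y'(s) = \frac{1}{1-s}, \qquad K_Y''(s) = \frac{1}{(1-s)^2}, \qquad s < 1.

% Tilting factor from K_Y'(\lambda) = y:
\lambda = 1 - \frac{1}{y}, \qquad
K_Y(\lambda) - \lambda y = \ln y - y + 1, \qquad
K_Y''(\lambda) = y^2.

% Saddlepoint density and its ratio to the true density:
\frac{e^{K_Y(\lambda) - \lambda y}}{\sqrt{2\pi K_Y''(\lambda)}}
  = \frac{y\, e^{1-y}}{\sqrt{2\pi}\, y}
  = \frac{e}{\sqrt{2\pi}}\, e^{-y},
\qquad
\frac{e}{\sqrt{2\pi}} \approx 1.0844 \quad \text{for every } y > 0.
```

The ratio does not depend on y, which is the uniformity of the error referred to above.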

This method, known as the Esscher tilt, is a recurring theme in more 
advanced work, particularly on risk contributions [23], which we shall de- 
velop later in this chapter, and in importance sampling [11] where it is used 
to steer the bulk of the sampling to the point of interest on the distribution 
(usually a long way out in the tail where there would otherwise be very few 
samples). One of the vital ingredients is the uniqueness of the tilting-factor. 
This follows from the convexity of $K_Y$, which in turn follows by consideration
of the quadratic $q(t) = E[(1 + tY)^2 e^{sY}]$. As q is nonnegative its discriminant
("$b^2 - 4ac$") must be $\le 0$, and so $(E[Y e^{sY}])^2 \le E[e^{sY}]\, E[Y^2 e^{sY}]$, which
on rearrangement gives $(\log M_Y)'' \ge 0$, as required. We shall encounter the
same method of proving convexity later on, in conjunction with the expected 
shortfall risk measure. 

Although the Esscher tilt is neat, our preferred approach uses contour 
integration, and explains where the term 'saddlepoint' comes from. The 
reader is directed to [7] for a fuller discussion of the asymptotic calculus.
Assume first that Y is the sum of independent and identically-distributed
random variables, so that $K_Y = nK_X$ with $K_X(s) = \log E[e^{sX}]$. Expressing
the density as a contour integral, distorting the path of integration C until
it lies along the path of steepest descent (which is why the MGF must exist
for s off the imaginary axis) and then applying Watson's lemma gives:

\[ f_Y(y) = \frac{1}{2\pi i} \int_C e^{n(K_X(s) - sy/n)}\, ds \tag{5} \]

\[ \sim \frac{e^{n(K_X(\hat s) - \hat s y/n)}}{\sqrt{2\pi n K_X''(\hat s)}}
   \left( 1 + \frac{1}{n} \left( \frac{K_X''''(\hat s)}{8 K_X''(\hat s)^2}
   - \frac{5 K_X'''(\hat s)^2}{24 K_X''(\hat s)^3} \right) + O(n^{-2}) \right) \]

where $\hat s$ is the value of s that makes $K_Y(s) - sy$ stationary, that is,

\[ K_Y'(\hat s) = y, \tag{6} \]

from which we see that $\hat s$ is interchangeable with $\lambda$. We prefer this method
of proof on the grounds that it is more upfront about the nature of the
approximation, and is more useful when singular integrands such as the tail
probability and shortfall are being examined. The term saddlepoint refers
to the nature of $K_Y(s) - sy$ at its stationary point: it has a local minimum
when approached along the real axis, but a maximum when approached in
the orthogonal direction, which is the path of integration here. A
three-dimensional plot of the function would therefore reveal it to be in the shape
of a saddle. By deforming the contour of integration so as to pass along the
path of steepest descent, one avoids the problem of approximating a Fourier
integral that is generally oscillatory (Figures 1, 2).




Figure 1: Real and imaginary parts of the inversion integrand $M(s)e^{-sy}$ where
$M(s) = (1 - \beta s)^{-\alpha}$ (the Gamma distribution), with $\alpha = 1$, $\beta = 0.5$, $y = 1$. There
is a branch point at $s = 1/\beta$ and the plane is cut from there to $+\infty$. The integrand
is oscillatory for contours parallel to the imaginary axis.

A fundamental point is that the result can be written entirely in terms
of $K_Y$ as

\[ f_Y(y) \sim \frac{e^{K_Y(\hat s) - \hat s y}}{\sqrt{2\pi K_Y''(\hat s)}}
   \left( 1 + \frac{K_Y''''(\hat s)}{8 K_Y''(\hat s)^2}
   - \frac{5 K_Y'''(\hat s)^2}{24 K_Y''(\hat s)^3} + O(K_Y^{-2}) \right). \tag{7} \]

Rather than being in descending powers of n, this approximation is in
'descending powers of $K_Y$'. One can therefore in principle use it in situations
where Y is not the sum of i.i.d. random variables. It can thus be applied to
sums of variables that are independent but not identically distributed, and
this is one of the most important developments and insights in the theory
and application. Common sense dictates, however, that the further away
one gets from the identically distributed case, the less reliable the
approximation is likely to be. For example, if there is a large exposure to
one very non-Gaussian instrument in the portfolio, a poorer approximation
should be expected (and indeed the correction term is considerable in that
case).

Figure 2: Absolute value of the same inversion integrand $M(s)e^{-sy}$ as Figure 1,
and path of steepest descent. The contour runs in a horseshoe from $\infty - \pi i$, through
1 (the saddlepoint), round to $\infty + \pi i$.
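The effect of the correction term can be illustrated on the Gamma distribution used in Figures 1 and 2, for which the saddlepoint and all derivatives of the KGF are available in closed form. This is a sketch only; the function names are illustrative, and the closed-form saddlepoint is specific to the Gamma case.

```python
import math

def gamma_saddlepoint_density(y, alpha, beta, with_correction=False):
    """Saddlepoint density for Gamma(shape alpha, scale beta), K(s) = -alpha*log(1 - beta*s)."""
    s_hat = (1.0 - alpha * beta / y) / beta     # solves K'(s) = alpha*beta/(1 - beta*s) = y
    u = 1.0 - beta * s_hat                      # equals alpha*beta/y, always positive
    k0 = -alpha * math.log(u)
    k2 = alpha * beta ** 2 / u ** 2
    k3 = 2.0 * alpha * beta ** 3 / u ** 3
    k4 = 6.0 * alpha * beta ** 4 / u ** 4
    density = math.exp(k0 - s_hat * y) / math.sqrt(2.0 * math.pi * k2)
    if with_correction:
        # First correction term: K''''/(8 K''^2) - 5 K'''^2/(24 K''^3).
        density *= 1.0 + k4 / (8.0 * k2 ** 2) - 5.0 * k3 ** 2 / (24.0 * k2 ** 3)
    return density

def gamma_density(y, alpha, beta):
    """Exact Gamma density, for comparison."""
    return y ** (alpha - 1.0) * math.exp(-y / beta) / (math.gamma(alpha) * beta ** alpha)

# Parameters of Figure 1: alpha = 1, beta = 0.5, y = 1 (saddlepoint at s = 1).
exact = gamma_density(1.0, 1.0, 0.5)
ratio_leading = gamma_saddlepoint_density(1.0, 1.0, 0.5) / exact
ratio_corrected = gamma_saddlepoint_density(1.0, 1.0, 0.5, with_correction=True) / exact
```

For the Gamma distribution the correction term works out to $-1/(12\alpha)$ regardless of y, so with $\alpha = 1$ the leading-order error of about 8% falls to under 1% once the correction is included.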

2.4 Tail probability 

The probability density function is not particularly useful in practice: the 
density function for a discrete model of losses consists of spikes and the fine 
structure of these is not especially important; one usually wants to know how 
likely it is for losses to exceed a certain level without having to integrate the 
density; and the approximated density does not exactly integrate to unity. 
We therefore derive the tail probability and related quantities as integrals 
and approximate them. 

The tail probability integral is in MGF form³

\[ P[Y \gtrsim y] = \frac{1}{2\pi i} \int_{I^+} e^{K_Y(s) - sy}\, \frac{ds}{s}. \tag{8} \]

The symbol $\gtrsim$ is to be interpreted thus:

\[ P[Y \gtrsim y] \equiv P[Y > y] + \tfrac12 P[Y = y]. \]

The reason for the $\tfrac12 P[Y = y]$ term is that if Y contains probability mass
at y then $M_Y(s)e^{-sy}$ contains a constant term of strength $P[Y = y]$, and
now the interpretation of (8) is rather delicate: the basic idea is to subtract
out the divergent part of the integral and observe that the 'principal value'
of $\frac{1}{2\pi i}\int_{-i\infty}^{i\infty} ds/s$ is $\tfrac12$. Notice therefore that if the true distribution of Y is
discrete then a naive comparison⁴ of $P[Y > y]$ with (8) (or any continuum
approximation to (8)) produces appreciable errors, but these have little to
do with the quality of the analytical approximation because they arise from
the continuity correction. In our examples, when we plot the tail probability,
it will be $P[Y \gtrsim y]$ vs y.

3 $I$ denotes the imaginary axis in an upward direction; the + sign indicates that the
contour is deformed to the right at the origin to avoid the singularity there (so it passes
round the origin in an anticlockwise, or positive, direction).

Daniels [9] points out that although the two methods of deriving the sad- 
dlepoint expansion for the density (Edgeworth using Esscher tilt, or steepest 
descent using Watson's lemma) give the same result, they give different re- 
sults for the tail probability. In essence this problem arises because of the 
singularity at s = 0. 

The method of Lugannani & Rice, which is preferable and will be used 
in several places in this chapter, removes the singularity by splitting out 
a singular part that can be integrated exactly. The saddlepoint expansion 
is then performed on the remaining portion. We refer to Daniels [9] for a 
fuller exposition, but here is the idea. It is convenient when performing 
the asymptotic approximation for the term in the exponential to be ex- 
actly quadratic rather than only approximately so. This can be effected by 
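changing variable from s to z according to

\[ \tfrac12 (z - \hat z)^2 = \big( K_Y(s) - sy \big) - \big( K_Y(\hat s) - \hat s y \big), \qquad
   -\tfrac12 \hat z^2 = K_Y(\hat s) - \hat s y, \]

which ensures that $s = 0 \Leftrightarrow z = 0$ and also $s = \hat s \Leftrightarrow z = \hat z$. Then the
integral can be split into a singular part that can be done exactly and a
regular part to which the usual saddlepoint treatment can be applied. The
result is

\[ P[Y \gtrsim y] \sim \Phi(-\hat z) + \phi(\hat z) \left( \frac{1}{\hat s \sqrt{K_Y''(\hat s)}} - \frac{1}{\hat z} + \cdots \right), \tag{9} \]

the celebrated Lugannani-Rice formula ($\Phi$ and $\phi$ denoting the standard Normal
distribution function and density). As pointed out by Barndorff-Nielsen
[6], this has the flavour of a Taylor series expansion of the function $\Phi(-\hat z + \cdots)$,
which does look more like a tail probability than does (9); after doing
the necessary algebra the result is

\[ P[Y \gtrsim y] \sim \Phi\left( -\hat z + \frac{1}{\hat z} \ln \frac{\hat z}{\hat s \sqrt{K_Y''(\hat s)}} + \cdots \right). \tag{10} \]

4 As, for example, in a discussion by Merino & Nyfeler [15, Ch. 17].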
Note that both expressions are regular at $\hat s = 0$ but need careful numerical
treatment near there: the preferred method is to expand in a Taylor series
in $\hat s$, and one obtains

\[ P[Y \gtrsim y] \sim \frac12 - \frac{K_Y'''(0)}{6\sqrt{2\pi}\, K_Y''(0)^{3/2}} + \cdots \]
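A minimal sketch of the Lugannani-Rice tail probability (9), tested on the unit exponential distribution where the exact answer $e^{-y}$ is known. The helper names are illustrative; the limiting value at the mean is obtained by expanding (9) to leading order in $\hat s$, and the closed-form saddlepoint is specific to the exponential case.

```python
import math

def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def lugannani_rice(s_hat, k_val, k2_val, y):
    """Tail probability from (9), given K(s_hat) and K''(s_hat); requires s_hat != 0."""
    z_hat = math.copysign(math.sqrt(2.0 * (s_hat * y - k_val)), s_hat)
    u_hat = s_hat * math.sqrt(k2_val)
    return norm_cdf(-z_hat) + norm_pdf(z_hat) * (1.0 / u_hat - 1.0 / z_hat)

def tail_at_mean(k2_zero, k3_zero):
    """Limit of (9) as s_hat -> 0, i.e. at y equal to the mean."""
    return 0.5 - k3_zero / (6.0 * math.sqrt(2.0 * math.pi) * k2_zero ** 1.5)

def exponential_tail_lr(y):
    """Unit exponential: K(s) = -log(1 - s), so K'(s) = y gives s_hat = 1 - 1/y."""
    s_hat = 1.0 - 1.0 / y
    k_val = -math.log(1.0 - s_hat)
    k2_val = 1.0 / (1.0 - s_hat) ** 2
    return lugannani_rice(s_hat, k_val, k2_val, y)

ratios = [exponential_tail_lr(y) / math.exp(-y) for y in (0.5, 2.0, 3.0, 5.0)]
at_mean = tail_at_mean(1.0, 2.0)   # K''(0) = 1, K'''(0) = 2 for the unit exponential
```

In practice one would switch to the limiting expression whenever $|\hat s|$ falls below some small threshold, since the two terms in the bracket of (9) then nearly cancel.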

An interesting consequence of the Lugannani-Rice formula is that, in the 
saddlepoint approximation, 

\[ \frac{\partial^2}{\partial y^2} \ln P[Y \gtrsim y] \le 0; \tag{11} \]

we say that the resulting distribution is log-concave. Hence the graph of
$P[Y \gtrsim y]$ vs y on a logarithmic scale, as we often plot, will always 'bend
downwards'. Not all distributions are log-concave: the exponential
distribution clearly is, and more generally so is the Gamma distribution
with shape parameter $\ge 1$, and the Normal distribution is too (by virtue of
the inequality $E[(Z - x)\mathbf{1}[Z > x]] > 0$); but the Student-t is not. To prove
the result, note from log-concavity of the Normal distribution that

\[ \hat z\, \Phi(-\hat z) - \phi(\hat z) < 0. \]

After some minor manipulations this can be recast as

\[ \frac{\hat s\, \phi(\hat z)}{\sqrt{K_Y''(\hat s)}}
   \left( \Phi(-\hat z) + \phi(\hat z) \left( \frac{1}{\hat s \sqrt{K_Y''(\hat s)}} - \frac{1}{\hat z} \right) \right)
   < \left( \frac{\phi(\hat z)}{\sqrt{K_Y''(\hat s)}} \right)^2. \]

By (9) this can be written, with $P[Y \gtrsim y]$ abbreviated to $P(y)$,

\[ P''(y) P(y) < P'(y)^2, \]

from⁵ which (11) is immediate.



5 Differentiation w.r.t. y slots in a factor of $(-s)$, so the first term on the LHS is $-1 \times$
the derivative of the density.

2.5 Tranche payoffs and expected shortfall (ESF)

We are now in a position to deal with the ESF, which is defined by $S^+[Y] =
E[Y \mid Y > y]$ or $S^-[Y] = E[Y \mid Y < y]$ (according as Y denotes portfolio loss
or portfolio value) where y is the VaR at the chosen tail probability.⁶ The
integral representation of the ESF is

\[ E\big[ Y \mathbf{1}[Y \gtrless y] \big] = \pm \frac{1}{2\pi i} \int_{I^\pm} K_Y'(s)\, e^{K_Y(s) - sy}\, \frac{ds}{s}. \]

By following the methods that we applied to the tail probability, we write the
singularity in the integrand as the sum of a regular part and an analytically
tractable singular part near s = 0:

\[ \frac{K_Y'(s)}{s} = \frac{\mu_Y}{s} + \frac{K_Y'(\hat s) - \mu_Y}{\hat s} + O(s - \hat s) \]

with $\mu_Y$ the mean of Y. The singular part then gives rise to the same
integral as the tail probability, calculated earlier, and the regular part can
be given the usual saddlepoint treatment. We have

\[ E\big[ Y \mathbf{1}[Y \gtrless y] \big] \sim \mu_Y P[Y \gtrless y] \pm \frac{y - \mu_Y}{\hat s} f_Y(y), \tag{12} \]

which is easily evaluated using the expressions for the density and the tail
probability. Division by the tail probability then gives the ESF. As usual the
formula is exact if Y is Normally distributed, as can be verified longhand.
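The exactness of (12) in the Normal case can be checked numerically rather than longhand. The sketch below takes the tail probability and density as inputs, so any approximation to them (for example Lugannani-Rice and the saddlepoint density) could be substituted; the function names and parameter values are illustrative.

```python
import math

def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def expected_shortfall_upper(mu, s_hat, y, tail_prob, density):
    """S+ = E[Y | Y > y] from (12): (mu*P + (y - mu)/s_hat * f) / P; requires s_hat != 0."""
    return (mu * tail_prob + (y - mu) / s_hat * density) / tail_prob

# Normal check: K(s) = mu*s + sigma^2 s^2 / 2, so K'(s) = y gives s_hat = (y - mu)/sigma^2.
mu, sigma, y = 1.0, 2.0, 4.0          # hypothetical values
z = (y - mu) / sigma
s_hat = (y - mu) / sigma ** 2
esf = expected_shortfall_upper(mu, s_hat, y, norm_cdf(-z), norm_pdf(z) / sigma)
esf_exact = mu + sigma * norm_pdf(z) / norm_cdf(-z)   # textbook Normal result
```

For the Normal distribution the factor $(y - \mu_Y)/\hat s$ is simply $\sigma^2$, which is why the two expressions coincide.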

6 From now on we shall assume that the distributions in question are continuous, so
we shall be less careful about distinguishing $\gtrsim$ from $>$. This avoids the complication of
defining VaR and ESF for discontinuous distributions (see [1] for the generalisation).

There is a close link to tranche payoffs (calls and puts on the loss
distribution), as follows. When computing the payoff in a tranche it is necessary
to find the difference between two call payoffs, where the call strikes (denoted
y in what follows) are the attachment and detachment points of the tranche.
The 'call' and 'put' payoffs are $C^+ = E[(Y - y)^+]$ and $C^- = E[(y - Y)^+]$
which have integral representation

\[ C^\pm = \frac{1}{2\pi i} \int_{I^\pm} M_Y(s)\, e^{-sy}\, \frac{ds}{s^2} \]

(incidentally the put-call parity formula $C^+ - C^- = \mu_Y - y$ can be inferred
by combining the two paths, which collapse to a loop round the origin, where
the residue is $M_Y'(0) - y = \mu_Y - y$). Integration by parts reduces the double
pole to a single pole:

\[ C^\pm = \frac{1}{2\pi i} \int_{I^\pm} \big( K_Y'(s) - y \big)\, e^{K_Y(s) - sy}\, \frac{ds}{s}, \]

clearly a close relative of the ESF integral. (This can also be arrived at by
noting that $E[(Y - y)^+] = E[(Y - y)\mathbf{1}[Y > y]]$, which is what we did above
for ESF.) Following the same route as with ESF we obtain

\[ C^\pm \sim \pm (\mu_Y - y)\, P[Y \gtrless y] + \frac{y - \mu_Y}{\hat s} f_Y(y). \]

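The call and put approximations, and the tranche expected loss built from two calls, can be sketched as follows. The Normal distribution is used as a test case because every ingredient is then exact; the function names, strikes and parameters are illustrative assumptions.

```python
import math

def norm_cdf(x):
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def norm_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def call_payoff(mu, s_hat, y, upper_tail, density):
    """C+ = E[(Y - y)^+] ~ (mu - y) P[Y > y] + (y - mu)/s_hat * f_Y(y)."""
    return (mu - y) * upper_tail + (y - mu) / s_hat * density

def put_payoff(mu, s_hat, y, lower_tail, density):
    """C- = E[(y - Y)^+] ~ -(mu - y) P[Y < y] + (y - mu)/s_hat * f_Y(y)."""
    return -(mu - y) * lower_tail + (y - mu) / s_hat * density

def normal_call(mu, sigma, y):
    """Call on Y ~ N(mu, sigma^2); the strike y must differ from mu so that s_hat != 0."""
    z = (y - mu) / sigma
    s_hat = (y - mu) / sigma ** 2
    return call_payoff(mu, s_hat, y, norm_cdf(-z), norm_pdf(z) / sigma)

def normal_put(mu, sigma, y):
    z = (y - mu) / sigma
    s_hat = (y - mu) / sigma ** 2
    return put_payoff(mu, s_hat, y, norm_cdf(z), norm_pdf(z) / sigma)

def tranche_expected_loss(mu, sigma, attach, detach):
    """Expected tranche loss as the difference of two calls on the portfolio loss."""
    return normal_call(mu, sigma, attach) - normal_call(mu, sigma, detach)

mu, sigma = 5.0, 2.0                       # hypothetical loss distribution
parity_gap = normal_call(mu, sigma, 8.0) - normal_put(mu, sigma, 8.0)
tranche = tranche_expected_loss(mu, sigma, 6.0, 9.0)
```

The difference of the call and put values should reproduce put-call parity, $\mu_Y - y$, and the tranche expected loss should lie between zero and the tranche width.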
2.6 Examples 1 (Independence) 

It is now time for some numerical examples. We take a default/no-default 
model for ease of exposition. In each case we obtain the true distribution by
inverting the Fourier integral numerically (by the FFT [26]), and compare it with
the saddlepoint approximation and the Central Limit Theorem. This allows the analytical
approximations to be verified. Different portfolio sizes and different default 
probabilities are assumed, as follows. 

• Figure 3: 10 assets, exposure 10, default probability 1%. Apart from
the obvious problem of trying to approximate a discrete distribution
with a continuous one, the saddlepoint accuracy is reasonably uniform
across the distribution.

• Figure 4: 100 assets, exposure 4, default probability 1%. This portfolio
is more fine-grained, but note that even with 1000 assets the CLT is
appreciably in error for higher quantiles.

• Figure 5: 100 assets, unequal exposures (median 5, highest 50),
default probabilities variable (typically 0.1-4%; lower for the assets with
higher exposure). Again the saddlepoint approximation works well
and accuracy is roughly uniform across the distribution.

• Figure 6: 100 assets, a couple of very large exposures (median 1,
highest 150), default probabilities variable (typically 0.04-4%; lower for
the assets with higher exposure).

The last case is extreme and has a few features that make it difficult 
to deal with: a large step in the loss distribution caused by the very bi- 
nary nature of the portfolio (if the biggest asset defaults, a huge loss is 
incurred; otherwise the losses are of little importance). In mark-to-market 
models there is a continuum of possible losses so this situation does not 
arise. Furthermore, even if a situation like this does crop up, the approx- 
imation is erring on the side of being conservative about the risk. Given 
that the period of the current 'Credit Crunch' has witnessed many allegedly 
'very low probability' events, this form of approximation error is unlikely
to result in bad risk management decisions. In the context of a portfolio
optimisation, the method would immediately start by chopping back such 
large exposures — and once it had done that, the approximation error would 
decrease anyway. 

Later on (Figure 10), in discussing risk contributions, we shall give another
example of a smallish inhomogeneous portfolio, showing that the
approximation works well. Other examples are given in [23, 19], the latter showing
the results of applying the method to assets that are Gamma distributed (a
continuous distribution).




Figure 3: Equal exposures, very small portfolio (10 assets). •=Saddlepoint,
o=CLT. Exact result (which is 'steppy') shown by unmarked line. (Panels: Loss,
Shortfall.)

Figure 4: As above but for larger portfolio (100 assets). (Panels: Loss,
Shortfall.)

Figure 5: Inhomogeneous portfolio (100 assets). The largest exposure is 10 ×
median. (Panels: Loss, Shortfall.)

Figure 6: An unlikely case: Extreme heterogeneity in which largest exposure
is 150 × median. The true distribution has a step in it at loss=150, which the
saddlepoint approximation attempts to smooth out, thereby overestimating risk at
lower levels. (Panels: Loss, Shortfall.)



2.7 Accuracy and asymptoticity of saddlepoint approximation

To understand why the saddlepoint approximation is accurate it is necessary 
first to understand in what sense the asymptotic expansion is being taken: 
the method is asymptotic for a large number of independent random vari- 
ables being added, though as we have just seen, the number need not be very 
large in practice. Typically the relative accuracy when the method is used 
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only for addition of independent random variables is roughly uniform across 
the whole distribution [IT] rather than just being a 'tail approximation'. 

For small portfolios, one can easily develop direct computations for es- 
tablishing the loss distribution, essentially by considering all possible com- 
binations of events. Asymptotic expansions work for portfolios that have 
a large number of assets. Why do we regard asymptotic methods as more 
useful? Essentially because there are more large numbers than there are 
small ones! 

We have said that a 'large number' of random variables have to be added, 
and to understand what 'large' means in practice we can look at the first 
correction term ([5]). Clearly the smaller the higher cumulants the smaller 
the error. However, certain non-Gaussian distributions give very small er- 
rors: for example the error vanishes altogether for the inverse Gaussian 
distribution, while for the Gamma distribution with shape parameter $\alpha$ it
is $-1/(12n\alpha)$, which is small when the shape parameter of $Y$, which is $n\alpha$,
is large. (If $Y$ is exponentially distributed then the correction term can be
calculated easily from ([7]) and it works out as $-1/12$, for all $s$; this almost
exactly cancels the aforementioned 8% error in the leading-order term. Jensen
[16] has a more complete discussion.) 
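
The Gamma figures quoted above can be checked numerically. The following is a minimal sketch, assuming that the first correction term takes the standard form $K''''/(8K''^2) - 5K'''^2/(24K''^3)$ evaluated at the saddlepoint (the usual first-order term for the saddlepoint density; the exact form of the paper's own expression is not reproduced here). The function names and the `rate` parameter are illustrative only.

```python
def gamma_cumulant_derivs(s, shape, rate=1.0):
    """K''(s), K'''(s), K''''(s) for the Gamma KGF
    K(s) = -shape * log(1 - s/rate), valid for s < rate."""
    d = rate - s
    return shape / d**2, 2.0 * shape / d**3, 6.0 * shape / d**4


def first_correction(k2, k3, k4):
    """Assumed standard first-order relative correction to the saddlepoint
    density: K''''/(8 K''^2) - 5 K'''^2/(24 K''^3)."""
    return k4 / (8.0 * k2**2) - 5.0 * k3**2 / (24.0 * k2**3)


def gamma_correction(s, n, alpha):
    """Correction for a sum of n i.i.d. Gamma(alpha) variables, whose sum
    is Gamma with shape n*alpha."""
    return first_correction(*gamma_cumulant_derivs(s, n * alpha))
```

Under these assumptions the result does not depend on `s`: `gamma_correction(s, n, alpha)` equals $-1/(12n\alpha)$ for any admissible `s`, and the exponential case (`n = 1`, `alpha = 1`) gives $-1/12$.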

Being more adventurous, one could even forget about independence and 
then use the saddlepoint approximation as a black box, as suggested in [23] . 
This is the 'direct approach'; we shall return to it later. 

2.8 Conditionally-independent variables 

By inserting the outer integration over the risk-factor we obtain all the results
for conditionally-independent variables. This means that $\hat{s}$, $\mu$ and the
density and tail probability are all factor-dependent. Treating things this
way, we divorce the question of the saddlepoint approximation's accuracy 
from the choice of correlation model, as the approximation is only being 
applied to the distribution of a sum of independent variables. This is im- 
portant, because the subject of correlation is a vast one and one could never 
hope to check the method's accuracy for 'all possible correlation models'. 
Making the choice of correlation model irrelevant is therefore a useful step. 
For the density, one has 
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and for the tail probability, 



P[Y>y] «E 



while for the shortfall,

with $P^+$, $P^-$ the upper and lower tail probabilities (subscripts $V$ and $|V$
denote the dependence on $V$). Incidentally the Central Limit Theorem would
give the following, which is obtained by taking $\hat{s}_V \to 0$:

with $\mu_{Y|V}$ and $\sigma^2_{Y|V}$ denoting the mean and variance of the portfolio
conditional on the risk-factor; this can be obtained directly.

It is worth mentioning at this point the granularity adjustment which
gives a formula for VaR when a small amount of unsystematic risk is added
to the portfolio. This can be derived from the above equations. Now if we
wish to incorporate the effects of unsystematic risk we can model the loss as
$Y = Y_\infty + U$, with $Y_\infty = E[Y \mid V]$ (the loss of the putative 'infinitely granular
portfolio', which need not exist in reality) and $U$ denoting an independent
Gaussian residual of variance $\sigma^2$ which can depend on $Y_\infty$. The difference
between the upper $P$-quantiles of $Y_\infty$ and $Y$ is given by the granularity
adjustment (GA) formula ([25] and references therein):

$$\mathrm{VaR}_P[Y] \sim \mathrm{VaR}_P[Y_\infty] - \frac{1}{2f(x)}\,\frac{d}{dx}\Big(\sigma^2(x) f(x)\Big)\Big|_{x=\mathrm{VaR}_P[Y_\infty]},$$

where $f$ is the density of $Y_\infty$. The shortfall-GA is
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Note that the correction to shortfall is always positive (and analytically 
neater), whereas the correction to VaR is not; we will be discussing this 
issue in more detail later, and essentially it follows from non-coherence of 
the VaR risk measure [31 Q] . 
Equation (15) and its simpler Central Limit analogue are the sum
of two pieces and it is attractive to interpret them as, respectively, the
contributions of systematic and specific risk to the portfolio ESF. This is
because the first term is related to the variation of the conditional mean of
the portfolio ($\mu_{Y|V}$) with the risk-factor and the second term is related to
the residual variance ($\sigma^2_{Y|V}$) not explained by the risk-factor. Roughly, the
first term is proportional to the interasset correlations (R-squared in KMV
terminology or $\beta^2$ in CAPM) and the second to the reciprocal of the portfolio
size. It turns out that this decomposition is analogous to the well-known
result for the variance,

$$V[Y] = V[E[Y \mid V]] + E[V[Y \mid V]].$$

This and related issues are discussed in [22].

Note that the log-concavity property derived earlier is destroyed by the 
mixing operation, so in the conditionally-independent case it is only the V- 
conditional distributions that are log-concave; the unconditional distribution 
may well not be. 

2.9 Computational issues 

It is worth mentioning some of the computational issues needed to implement
a workable calculator. A large proportion of the computation time is spent
in the root-searching for $\hat{s}$: the equation $K'_{Y|V}(\hat{s}_V) = y$ has to be solved for
each $V$, and if one is finding the VaR for some given tail probability, the
loss level ($y$) will have to be adjusted in some outer loop. Consequently, it is
worth devoting some effort to optimising the root-searching. The first thing
to note is that the saddlepoint is roughly given by $\hat{s} \approx (y - K'(0))/K''(0)$, as
a development of $K(s)$ around $s = 0$ gives $K(s) = K'(0)s + \tfrac{1}{2}K''(0)s^2 + \cdots$
(which would be exact for a Normal distribution). The value $K''(0)^{-1/2}$,
which is the reciprocal of the standard deviation, is a convenient 'unit of
measurement' that tells us how rapidly things vary in the $s$-plane: note that
$s$ has dimensions reciprocal to $Y$, so if typical losses are of the order $10^6$
USD, then typical values of $s$ will be of order $10^{-6}$ USD$^{-1}$. Secondly, as
$K'$ is an analytic function, it makes sense to solve $K'(s) = y$ by Newton-Raphson,
but higher-order variants (i.e. formulas that take into account the
convexity of $K'$) generally converge much more quickly and have a wider
basin of convergence. This means that $K'''(s)$ will need to be known for any
$s$. It follows that the routine that evaluates $K_{Y|V}(s)$ had better calculate
three $s$-derivatives. (Recall also that $K'''(0)$ is explicitly used in a special
case of the tail probability calculation when $\hat{s} = 0$.) A well-optimised
root-searching routine should be able to find $\hat{s}$ to machine precision in about
three trials.
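
The root-search described above can be sketched as follows for a single value of the risk-factor, assuming an independent default/no-default portfolio with KGF $K(s) = \sum_j \log(1 - p_j + p_j e^{a_j s})$. Halley's iteration is used here as one possible 'higher-order variant'; it is our choice for illustration, not necessarily the scheme the authors use, and the function names are hypothetical. The sketch makes no attempt to reach the iteration count quoted above.

```python
import math


def cgf_derivs(s, pds, exposures):
    """K'(s), K''(s), K'''(s) for independent default/no-default losses,
    K(s) = sum_j log(1 - p_j + p_j * exp(a_j * s))."""
    k1 = k2 = k3 = 0.0
    for p, a in zip(pds, exposures):
        e = p * math.exp(a * s)
        q = e / (1.0 - p + e)  # exponentially tilted default probability
        k1 += a * q
        k2 += a * a * q * (1.0 - q)
        k3 += a ** 3 * q * (1.0 - q) * (1.0 - 2.0 * q)
    return k1, k2, k3


def solve_saddlepoint(y, pds, exposures, tol=1e-13, max_iter=50):
    """Solve K'(s) = y by Halley's method, starting from the Gaussian
    guess s = (y - K'(0)) / K''(0). Returns (s, iterations used)."""
    k1, k2, _ = cgf_derivs(0.0, pds, exposures)
    s = (y - k1) / k2
    for n_iter in range(1, max_iter + 1):
        k1, k2, k3 = cgf_derivs(s, pds, exposures)
        g = k1 - y
        denom = 2.0 * k2 * k2 - g * k3
        # Halley step; fall back to a plain Newton step if the
        # Halley denominator is not positive.
        step = 2.0 * g * k2 / denom if denom > 0.0 else g / k2
        s -= step
        if abs(step) <= tol * max(1.0, abs(s)):
            return s, n_iter
    raise RuntimeError("saddlepoint search did not converge")
```

For a homogeneous portfolio (100 assets, exposure 4, default probability 1%) and loss level 20, the root can be checked against the closed form $e^{4\hat{s}} = 0.0495/0.0095$. No safeguards against overflow of `exp` for extreme loss levels are included.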

2.10 Examples 2 (Conditional independence) 

As an example of correlated assets we take a default/no-default model under
the one-factor Gaussian copula. (To remind: this means that the conditional
default probability of an asset is $p(V) = \Phi\big((\Phi^{-1}(p) - \beta V)/\sqrt{1 - \beta^2}\big)$, where $V \sim N(0, 1)$
is the risk-factor, $p$ is the expected default frequency and $\beta$ is its correlation.)
The characteristics are: 50 assets, exposures net of recovery mainly
1-5 units, with one or two larger ones; $p$ 0.2%-3%; $\beta$'s equal. Figures 7-10
show the results, comparing with Monte Carlo with 1 million simulations.
Four different values of $\beta$ are considered to show the full range of correlations
that might reasonably be applied in practice. It is not surprising that the
accuracy is good, as all that is happening (in essence) is that results analogous
to those displayed previously are being mixed together for different
values of the risk-factor.
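
As a small illustration of the one-factor Gaussian copula used in this example, the sketch below evaluates the conditional default probability and checks that averaging it over the risk-factor recovers the unconditional default probability. The sign convention (a negative factor value increases defaults) and the simple trapezoidal quadrature are assumptions made for illustration; the function names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def conditional_pd(p, beta, v):
    """p(V) = Phi((Phi^{-1}(p) - beta*V) / sqrt(1 - beta^2)).
    The minus sign in front of beta*V is an assumed convention."""
    threshold = _STD_NORMAL.inv_cdf(p)
    return _STD_NORMAL.cdf((threshold - beta * v) / math.sqrt(1.0 - beta ** 2))


def average_pd(p, beta, n=4001, vmax=8.0):
    """Trapezoidal integration of p(V) against the N(0,1) density of V."""
    h = 2.0 * vmax / (n - 1)
    total = 0.0
    for i in range(n):
        v = -vmax + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * conditional_pd(p, beta, v) * _STD_NORMAL.pdf(v)
    return total * h
```

In a conditional-independence calculation the same outer quadrature over `v` would be applied to the conditional density, tail probability or shortfall rather than to the default probability itself.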




Figure 7: Correlated test portfolio in Example 2: $\beta = 0.3$. (Panels: Loss, Shortfall.)



2.11 The direct saddlepoint approximation 

So far our exposition of analytical approximations has been confined to sums
of independent random variables, and extension to conditionally-independent
variables has been trivial (condition on risk-factor, find distribution,
integrate out), which is the intuitively obvious way of doing it. Experience
shows however that the direct approach ([3]), where the saddlepoint method
is applied to the unconditional MGF, can also be quite effective. The main
attraction of the direct approach is that if $C_Y(\omega) = E[C_{Y|V}(\omega)]$ is known in
closed form, there is no subsequent integration over $V$ to be done; and the
saddlepoint no longer depends on the risk-factor, which makes for simple
computation. (The formulas are obvious, as one simply uses the unconditional
MGF in the cases derived for independence.) The direct approach
also has the mathematical elegance of being a natural extension of mean-variance
theory, which is recovered in the limit $\hat{s} \to 0$ (see [23], [24]): the
indirect approach does not do this, because the saddlepoint is a function of
the risk-factor. But how justifiable is the direct approach?

Figure 8: As above but with $\beta = 0.5$.

Figure 9: As above but with $\beta = 0.7$.

Figure 10: As above but with $\beta = 0.9$.



Suppose, for the purposes of illustration, that it is the exposures rather
than the default probabilities that depend on the risk-factor (so that correlated
exposures are being modelled, as perhaps in a model of counterparty
risk in OTC derivatives). Let us also assume the risk-factor to be lognormally
distributed and that each counterparty's exposure is a constant multiple
of that factor. Now focus on what happens when the factor integration
is done. In the indirect approach it is the conditional tail density, probability,
etc., that is being integrated over the risk-factor, which causes no
problems. But in the direct approach, it is the conditional MGF that is being
integrated prior to the saddlepoint approximation being done: and that
integral doesn't exist, because the lognormal distribution is too fat-tailed:
the integral is

$$M_Y(s) = E\big[E[e^{sY} \mid V]\big] = \int_0^\infty \prod_j \big(1 - p_j + p_j e^{s a_j \alpha}\big)\, \psi(\alpha)\, d\alpha$$

where $p_j$ are the default probabilities and $a_j$ the average exposures, and
$\psi$ denotes the lognormal density. So in this sort of situation the direct
approach cannot possibly work. Therefore the indirect approach is prima
facie more generally applicable.
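
The divergence can be seen numerically by looking at the logarithm of the integrand for large values of the factor. The sketch below is illustrative only: the portfolio, the lognormal parameters `mu` and `sigma`, and the function name are assumptions, not taken from the text.

```python
import math


def log_integrand(alpha, s, pds, exposures, mu=0.0, sigma=1.0):
    """Log of prod_j (1 - p_j + p_j*exp(s*a_j*alpha)) * psi(alpha),
    where psi is the lognormal(mu, sigma) density of the factor alpha."""
    total = 0.0
    for p, a in zip(pds, exposures):
        x = s * a * alpha
        if x > 0.0:
            # log(1 - p + p*e^x) rewritten to avoid overflow for large x
            total += x + math.log(p + (1.0 - p) * math.exp(-x))
        else:
            total += math.log(1.0 - p + p * math.exp(x))
    log_psi = (-math.log(alpha * sigma * math.sqrt(2.0 * math.pi))
               - (math.log(alpha) - mu) ** 2 / (2.0 * sigma ** 2))
    return total + log_psi
```

For any $s > 0$ the linear growth of $s a_j \alpha$ eventually dominates the $-(\log\alpha)^2$ decay of the lognormal density, so the integrand grows without bound and the integral cannot converge; for $s < 0$ the integrand decays and no such problem arises.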

On the other hand the direct method does work very well for some 
models: 

• Feuerverger & Wong [10] used it to approximate the distribution of a 
quadratically-transformed multivariate Gaussian variable, which they 
used as a general tool for market risk problems. 
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• Gordy [13] used it on the CreditRisk+ model, which we have described: 
the loss distribution is a Poisson distribution whose mean is stochastic 
and Gamma-distributed. 

• Previously to both, Martin ([IE]; see also [IS]) used it to approximate 
a Gamma distribution integrated over a Poisson, as a prototypical insurance
or reliability problem. Explicitly: events occur as a Poisson
process and each time an event occurs a loss is generated. The distribution
of total loss over some time period is required. In other words the
distribution of $\sum_{i=1}^{N} X_i$ is required, where $X_i$ are i.i.d. Gamma($\alpha$, $\beta$)
and $N$ is Poisson($\theta$). By conditioning on $N$ and integrating out, we
find the MGF to be

$$M_Y(s) = \sum_{r=0}^{\infty} \frac{e^{-\theta}\theta^r}{r!}\,(1 - \beta s)^{-r\alpha} = \exp\big(\theta((1 - \beta s)^{-\alpha} - 1)\big)$$

and, rather conveniently, the equation $K'_Y(s) = y$ can be solved for $s$
algebraically. As a check, the loss distribution can also be calculated
directly as an infinite series of incomplete Gamma functions.
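
For the compound Poisson-Gamma case in the last bullet, the algebraic solution of the saddlepoint equation can be written out explicitly. With $K_Y(s) = \theta((1 - \beta s)^{-\alpha} - 1)$ we have $K'_Y(s) = \theta\alpha\beta(1 - \beta s)^{-\alpha - 1}$, so $K'_Y(s) = y$ gives $\hat{s} = \big(1 - (\theta\alpha\beta/y)^{1/(\alpha+1)}\big)/\beta$. A minimal sketch, with hypothetical function names and $\beta$ taken as the Gamma scale parameter (as implied by the MGF):

```python
def kgf_first_derivative(s, theta, alpha, beta):
    """K'_Y(s) for K_Y(s) = theta * ((1 - beta*s)**(-alpha) - 1), s < 1/beta."""
    return theta * alpha * beta * (1.0 - beta * s) ** (-alpha - 1.0)


def saddlepoint(y, theta, alpha, beta):
    """Closed-form root of K'_Y(s) = y for a loss level y > 0."""
    if y <= 0.0:
        raise ValueError("loss level y must be positive")
    return (1.0 - (theta * alpha * beta / y) ** (1.0 / (alpha + 1.0))) / beta
```

The root is zero at the mean loss $y = \theta\alpha\beta$, positive above it and negative below it, and always lies below the singularity at $s = 1/\beta$.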

The first problem can be boiled down to the sum of independent Gamma- 
distributed variables, and in the other two, the distributions at hand are 
from exponential families that are known to be well-approximated by the 
saddlepoint method. By 'exponential family' one means a distribution in 
which $\nu K(s)$ is a valid KGF for all real $\nu > 0$. Such distributions are then
infinitely divisible⁷. It is therefore not surprising that the direct saddlepoint
approximation worked well. However, there is no uniform convergence result 
as for a sum of independent random variables. 

This is a difficult area to make general comments about. We suggest 
that the direct method be applied only to exponential families. This in- 
cludes what is mentioned above, or for example portfolio problems in which 
the joint distribution of assets is multivariate Normal Inverse Gaussian or 
similar. 

7 Which connects them to Lévy processes, an issue which the reader may wish to pursue
separately [27].

3 Risk contributions

The objective behind risk contributions is to understand how risk depends
on asset allocation. Often, the precise risk number for a portfolio is not
the most important issue, as for one thing that is critically dependent on
the parameters such as default probabilities or spread volatilities and dis- 
tributions, default or spread correlations, and the like. It is therefore more 
important to use the risk measure comparatively, i.e. compare two portfolios 
or one portfolio over time. In other words, the real issue is: What makes 
the risk change? This means that we wish to calculate derivatives of VaR 
and ESF with respect to the asset allocations in the portfolio. It turns out 
that these quantities provide an explicit answer to the following fundamen- 
tal question: Given that I lose more than a certain amount, which assets, 
positions or scenarios are likely to have been responsible? We shall see that 
these so-called 'contributions' sum to give the portfolio risk. From the point 
of view of the portfolio manager, this information is easily presented as a list, 
in decreasing order, of the biggest contributions, or, if you like, the biggest 
'headaches'. Once these have been identified, it is clear how to improve the 
efficiency of the portfolio: one identifies the riskiest positions and, if these 
are not generating a commensurate expected return, one hedges them or 
chops them down. 

In the context of a credit portfolio, which instruments generate the most 
risk? Clearly large unhedged exposures to high-yield names will be the 
biggest, but there is more to it than that. For one thing, the more correlated 
a name is with the portfolio, the more risk it will contribute. Secondly, 
what is the tradeoff between credit quality and exposure? It is common 
sense that banks will lend more to so-called 'less-risky' names than riskier 
ones, usually based on credit rating. But should one lend twice as much 
to a AA as to a BB, or twenty times? The answer to that depends on the 
risk measure. For tail-based measures such as VaR and ESF, poor-quality 
assets are not penalised as much as they are with the standard deviation 
measure. This is because the worst that can happen to any credit is that it 
defaults, regardless of credit rating; so with VaR at 100% confidence, the risk 
contribution is just the exposure (net of recovery). So a VaR- or shortfall- 
based optimisation will produce a portfolio with a lower proportion of so- 
called high-grade credits than a standard-deviation-based one, preferring 
to invest more in genuinely riskfree assets such as government bonds or in 
'rubbish' such as high-yield CDO equity that has a small downside simply 
because the market is already pricing in the event of almost-certain wipeout. 
To give a quick demonstration of this, Figure 11 compares standard deviation
contribution with shortfall contribution in the tail for a default/no-default 
model. Tail risks and 'standard deviation risks' are not equivalent. 

Figure 11: VaR and shortfall contributions, as % of portfolio risk, compared in a
default/no-default model. 'A' and 'B' are tail risks (large exposures to high-grade
assets) and 'C' is the opposite: a smallish exposure to a very low-grade credit. In
a standard deviation based optimisation, C generates most risk, with A being of
little importance; in a tail-based one, A contributes most risk.

In the context of a trading book, the considerations are different. Books
are not generally conceived as portfolios, particularly on the sell-side, and
positions may be turned over rapidly (potentially within a day). Given this,
there is no interpretation of any position having an expected return, which 
would make an optimisation seemingly pointless. However, the concept of 
risk contribution is still important, even if risk is now to be defined as a 
mark-to-market variation at a much shorter time horizon such as a couple 
of weeks; default is still included, as a particularly large move in value, but 
for most credits will generally be too unlikely to show up on the 'risk radar'. 
The risk contribution is to be interpreted as a way of determining positions 
that contribute far too much risk in relation to the profit that they might 
conceivably generate, such as levered positions on so-called 'safe' names. 

One would expect that, given (almost-) closed-form expressions for tail 
probability and shortfall, it should be relatively simple to obtain deriva- 
tives either by direct differentiation of the approximated tail probability 
and shortfall, or alternatively by performing the saddlepoint approximation 
on the derivatives. It turns out that the second route gives slightly neater re- 
sults and that is the one that we shall pursue here. Following option-pricing 
parlance, we refer to the first and second derivatives of risk measures w.r.t. 
their asset allocations as their 'delta' and 'gamma'. 

We start by developing the VaR contribution as a conditional expectation.
It turns out that there are some fundamental conceptual difficulties
with VaR contribution, notably for discrete-loss models such as CreditRisk+.
Paradoxically, the exact contribution is an ill-posed problem and
analytical approximations are considerably more useful in practice. The
shortfall contribution suffers less from these problems. For the shortfall it 
is known that the measure is convex, and it turns out that the saddlepoint 
approximation preserves that convexity. This is an important result because 
it shows that use of the saddlepoint-ESF will give a unique optimal portfolio 
in an optimisation. We will show examples along the way. 

3.1 VaR 

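Our starting-point is the expression for the upper tail probability

$$P^+ = P[Y > y] = \frac{1}{2\pi i}\int_{\mathcal{R}^+} C_Y(\omega)\, e^{-i\omega y}\, \frac{d\omega}{\omega} \tag{19}$$

and we now wish to perturb the VaR ($= y$) while keeping the tail probability
constant. This gives

$$0 = \frac{\partial P^+}{\partial a_j} = \frac{1}{2\pi i}\int_{\mathcal{R}} \left(\frac{\partial C_Y}{\partial a_j} - i\omega\,\frac{\partial y}{\partial a_j}\, C_Y(\omega)\right) e^{-i\omega y}\, \frac{d\omega}{\omega}$$

(the integrand is regular, so the contour can go through the origin). As

$$\frac{\partial C_Y}{\partial a_j} = \frac{\partial}{\partial a_j} E\big[e^{i\omega \sum_k a_k X_k}\big] = i\omega\, E[X_j e^{i\omega Y}] \tag{20}$$

we have

$$\frac{\partial y}{\partial a_j} = \frac{E[X_j \delta(Y - y)]}{E[\delta(Y - y)]} = E[X_j \mid Y = y] \tag{21}$$

which is the well-known result. From 1-homogeneity of VaR we should
have $\sum_j a_j\, \partial y/\partial a_j = y$, which is clear from the above equation. The VaR
contribution is (21) multiplied by $a_j$, so the VaR contributions add to the
VaR.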

Before pressing on with the theory let us consider some practical implica- 
tions of this result. First, as the VaR contribution requires one to condition 
on a precise level of portfolio loss, it is quite difficult to estimate by Monte 
Carlo simulation. Two approaches that have been applied are kernel esti- 
mation, which uses information from simulations where the portfolio loss is 
close to the desired level, and importance sampling, in which the 'region of 
interest' is preferentially sampled by changing measure. A strength of the 
analytical approaches is that they are free of Monte Carlo noise. 



8 $\mathcal{R}$ denotes the real axis, with + indicating that the contour is indented so as to pass
anticlockwise at the origin, and − clockwise.
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A second issue is not so well-known but fundamentally more worrying: 
even if it were easily computable, the exact VaR contribution might not be a 
very sensible construction. The problem occurs with discrete distributions, 
for example in grid-based models such as CreditRisk+ [15], or CDO pricing 
algorithms [2] [H] and in Monte Carlo simulation of any model. The following 
example is quite illuminating: with a default/no-default model of portfolio
loss, let the exposures net of recovery be 

9,8,18,9,8,20,17, 16,12,12. 

The following assertions may come as a surprise: 

• At portfolio loss=40, the VaR contribution of the first asset is zero 
(regardless of the default probabilities or correlation model). It is 
possible for a loss of 40 to occur (16+12+12, 8+20+12, etc.) but none 
of the relevant combinations includes a loss of 9. 

• However, the VaR contribution increases if the exposure is decreased by 
1 unit (assuming that the VaR does not change thereby): 8+20+12=40. 
So VaR contribution is not a 'sensible' function of exposure. 

• Also, the first asset does contribute if the loss level is 38 (9+9+20, 
etc.) or 41 (9+20+12, etc.). So VaR contribution is not a 'sensible' 
function of attachment point either. 

• As a consequence of the previous point, the position changes markedly
if another asset is added: for example, addition of another asset will in
general cause the VaR contribution of the first asset to become nonzero
(e.g. an additional exposure of 1 gives 9+18+12+1=40).
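
The assertions in the list above can be verified by brute force, since the portfolio has only $2^{10}$ default patterns. The sketch below computes the exact VaR delta $E[X_j \mid Y = y]$ by enumeration, with $X_j$ the default indicator of asset $j$. Independent defaults with a common default probability of 10% are assumed purely for illustration (the zero contribution at loss 40 does not depend on that choice); the function name is hypothetical.

```python
from itertools import product

EXPOSURES = [9, 8, 18, 9, 8, 20, 17, 16, 12, 12]
PD = 0.10  # assumed common default probability, independent defaults


def exact_var_deltas(y, exposures=EXPOSURES, pd=PD):
    """Return (P[Y = y], [E[X_j | Y = y] for each j]) by enumerating all
    default patterns; the list is None when the loss level cannot occur."""
    n = len(exposures)
    prob_y = 0.0
    joint = [0.0] * n  # accumulates E[X_j * 1{Y = y}]
    for pattern in product((0, 1), repeat=n):
        if sum(a * x for a, x in zip(exposures, pattern)) != y:
            continue
        k = sum(pattern)
        weight = pd ** k * (1.0 - pd) ** (n - k)
        prob_y += weight
        for j, x in enumerate(pattern):
            if x:
                joint[j] += weight
    if prob_y == 0.0:
        return 0.0, None
    return prob_y, [g / prob_y for g in joint]
```

With these inputs the delta of the first asset is exactly zero at loss 40 but positive at 38 and 41, a loss of 11 has zero probability so the delta is undefined there, and at any attainable loss level the contributions $a_j E[X_j \mid Y = y]$ add up to $y$.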

Figure 12 illustrates the problems for this portfolio. For simplicity we
have assumed independence of the losses and given each asset a default probability
of 10%. The exact VaR deltas can be computed by direct calculation
of the Fourier integral above, using the Fast Fourier Transform algorithm.
The figure shows the VaR delta for the first asset, for different loss levels.
The most striking thing is that the graph is, to put it colloquially, all over
the place. The delta is not even defined at some loss levels because those
losses cannot occur (e.g. loss of 11). By way of motivation, let us point in
Figure 12 to some more results that we shall derive presently. First, we have
plotted the saddlepoint VaR delta for asset 1 as a function of loss level, and
observe that the graph is nice and smooth, appearing to steer some sort
of median course; indeed, a polynomial fit to the exact VaR delta is
quite close. The adjacent plot shows the shortfall deltas, which are much
less 'random', and again the saddlepoint approximation picks up the important
feature of the graph very well. Similar remarks apply to the second
derivative ('gamma') of shortfall in Figure 13. We have also plotted VaR
and shortfall as functions of tail probability to compare the saddlepoint approximation
with the exact answer, and find them to be very close, despite
the small size (10 names) and inhomogeneity of the portfolio.

Figure 13: (Left) Shortfall gamma of one particular asset as a function of VaR.
Dots show the exact result; the curve is the saddlepoint result. (Right) VaR and
shortfall in the portfolio. For each, the dotted line shows the approximation, and
the solid one shows the exact result.

Intriguingly, working with the VaR of a continuum model of a portfolio
distribution actually works much better in practice. This leads us to the saddlepoint
approximation. Incidentally Thompson and coworkers [29] arrive
at essentially the same equations through an application of the statistical
physics concept of an ensemble, in which one looks at an average of many
independent copies of the portfolio. We recast (19) using the MGF, writing
the $V$-expectation outside the inversion integral (i.e. following the 'indirect'
route):

Differentiating w.r.t. the asset allocations gives

(note that the integrand is now regular at the origin). Performing the saddlepoint
approximation, we arrive at

If we define the tilted conditional expectation $\tilde{E}_V[\cdot]$ by

$$\tilde{E}_V[Z] = \left.\frac{E[Z e^{sY} \mid V]}{E[e^{sY} \mid V]}\right|_{s=\hat{s}_V}$$

(note, when $\hat{s}_V = 0$ we have $\tilde{E}_V = E_V$) then we have

$$\frac{1}{\hat{s}_V}\frac{\partial K_{Y|V}}{\partial a_j} = \tilde{E}_V[X_j].$$
From (21), (23) and Bayes' theorem, we can associate the conditional loss
with the saddlepoint approximation,

$$E[X_j \mid Y = y, V] \sim \tilde{E}_V[X_j]. \tag{25}$$



Of course, it is also possible to derive this directly, i.e. without recourse to 
VaR, simply by writing down the integral representation of E[Xj5(Y — y)] 
and performing the saddlepoint approximation. 

9 $p_j$, $a_j$ are as before the default probabilities and losses.

As a specific example, in the case of a default/no-default model with
independence we have the contribution of the jth asset⁹ as

$$\tilde{p}_j a_j, \qquad \tilde{p}_j = \frac{p_j e^{a_j \hat{s}}}{1 - p_j + p_j e^{a_j \hat{s}}},$$

which is its expected loss in the exponentially tilted probability measure
with tilting factor $a_j \hat{s}$. It is important to note that the default probability
is thereby weighted with a loading that increases exponentially with loss, 
making large loss events show up on the 'risk radar' even if they have low 
probability. The same idea is used in importance sampling [ITJ, in which 
tilting the probabilities to increase them allows the point of interest in the 
loss distribution to be preferentially sampled. Indeed, the change of measure 
thereby performed is that which moves the expected loss of the distribution 
to the desired value while keeping the distribution's entropy as high as possible
(another concept that links to statistical physics). When correlation
is introduced the situation is not much different, as is apparent from (23),
for one is only averaging a similar expression over different states of the 
risk-factor. 

It is worth noting that, for each value of the risk-factor, the saddlepoint
VaR 'delta' $\tilde{E}_V[X_j]$ is an increasing function of the threshold $y$ (because
$K_{Y|V}(s) = \sum_j K_{X_j|V}(a_j s)$ in that case and $K'_{X_j|V}$ is an increasing function).
But this is a property of saddlepoint approximations, not a general
result: as we already know from Figure 12, where for simplicity the losses
are independent, the exact VaR contribution is not necessarily an increasing
function of the threshold. So the saddlepoint approximation is doing some
sort of desirable 'smoothing'. For positively correlated losses the saddlepoint
approximation will again give monotonicity, but if they are negatively correlated
then this will no longer be so: an asset negatively correlated with the
portfolio will on average have a lower value when the portfolio value is high,
and vice versa. (The reason for the non-monotonicity is that $f_{Y|V}(y)/f_Y(y)$
is not necessarily an increasing function of $y$; for the case of independence
though, it is always unity.)

A final advantage of the saddlepoint approximation is its computational
efficiency. Surprisingly perhaps, the calculation of saddlepoint risk contributions
requires no more work than calculating the risk in the first place.
The reason for this is that, by conditional independence,

$$K_{Y|V}(s) = \sum_j K_{X_j|V}(a_j s)$$

and so

$$\frac{1}{\hat{s}_V}\frac{\partial K_{Y|V}}{\partial a_j}(\hat{s}_V) = K'_{X_j|V}(a_j \hat{s}_V).$$

Now, when $K'_{Y|V}(\hat{s}_V) = y$ was solved for $\hat{s}_V$, the derivatives of $K_{X_j|V}$ were
calculated (the first derivative is of course essential; second and higher order
ones are used for Newton-Raphson iterative schemes). One only has to cache
these derivatives (for each $j$ and $V$) during the root-searching routine and
recall them when it is desired to calculate the VaR contribution. The same
remark applies to both derivatives of shortfall, when we derive them. This
compares favourably with other methods in which differentiation must be
carried out for each asset in the portfolio [28].
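
The caching remark can be illustrated for an independent default/no-default portfolio, where $K'_{X_j}(a_j s)$ is simply the tilted default probability. In the sketch below the quantities evaluated during the root-search are exactly the ones returned as VaR deltas. Plain bisection is used only to keep the example short (it is not the higher-order scheme one would use in practice), and the bracket `[-5, 5]` and the function names are assumptions for illustration.

```python
import math


def tilted_pds(s, pds, exposures):
    """K'_{X_j}(a_j s) for each asset: the exponentially tilted default
    probabilities p_j e^{a_j s} / (1 - p_j + p_j e^{a_j s})."""
    out = []
    for p, a in zip(pds, exposures):
        e = p * math.exp(a * s)
        out.append(e / (1.0 - p + e))
    return out


def solve_s(y, pds, exposures, lo=-5.0, hi=5.0, iters=200):
    """Bisection on K'(s) - y, using the fact that K' is increasing.
    The bracket [lo, hi] is assumed to contain the root."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        k1 = sum(a * q for a, q in zip(exposures, tilted_pds(mid, pds, exposures)))
        if k1 < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def saddlepoint_var_contributions(y, pds, exposures):
    """Return (s_hat, deltas, contributions), where deltas[j] is the tilted
    default probability and contributions[j] = a_j * deltas[j]."""
    s_hat = solve_s(y, pds, exposures)
    deltas = tilted_pds(s_hat, pds, exposures)
    return s_hat, deltas, [a * q for a, q in zip(exposures, deltas)]
```

Because the saddlepoint equation is $\sum_j a_j \tilde{p}_j = y$, the contributions add up to the loss level by construction, and each delta increases smoothly with $y$.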

To round this section off we show the special case where $\hat{s}_V \to 0$, which
generates the CLT. If the portfolio loss distribution conditional on $V$ is
Normal with mean $\mu_{Y|V}$ and variance $\sigma^2_{Y|V}$ then the density
and distribution function are, respectively,

$$f_Y(y) = E\big[\sigma_{Y|V}^{-1}\phi(z_V)\big], \qquad P[Y < y] = E[\Phi(z_V)],$$

with

$$z_V = \frac{y - \mu_{Y|V}}{\sigma_{Y|V}}$$

(note that this is actually the same $z$ as we used in conjunction with the
Lugannani-Rice formula earlier). It is then readily established, and also
verifiable from first principles, that

$$\frac{\partial y}{\partial a_j} = \frac{1}{f_Y(y)}\, E\left[\left(\frac{\partial \mu_{Y|V}}{\partial a_j} + z_V\,\frac{\partial \sigma_{Y|V}}{\partial a_j}\right)\frac{1}{\sigma_{Y|V}}\,\phi(z_V)\right]. \tag{26}$$



3.2 Shortfall 
First derivative 

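The starting-point is the Fourier representation of $S^\pm[Y]$:

$$S^\pm[Y] \equiv E[Y \mid Y \gtrless y] = y \mp \frac{1}{2\pi P^\pm}\int_{\mathcal{R}^\pm} C_Y(\omega)\, e^{-i\omega y}\, \frac{d\omega}{\omega^2} \tag{27}$$

From this we obtain derivative information via the obvious route of differentiating
w.r.t. $a_j$ at constant tail probability. Note that because the tail
probability is being held fixed, $y$ will depend on the ($a_j$); however, we know
all about that from the VaR derivations, and conveniently some cancellation
occurs en route. This gives

$$\frac{\partial S^\pm[Y]}{\partial a_j} = \mp\frac{1}{2\pi P^\pm}\int_{\mathcal{R}^\pm} \frac{\partial C_Y}{\partial a_j}\, e^{-i\omega y}\, \frac{d\omega}{\omega^2}.$$

Using (20) we obtain

$$\frac{\partial S^\pm[Y]}{\partial a_j} = \frac{\pm 1}{2\pi i P^\pm}\int_{\mathcal{R}^\pm} E[X_j e^{i\omega Y}]\, e^{-i\omega y}\, \frac{d\omega}{\omega} = \frac{1}{P^\pm}\, E\big[X_j \mathbf{1}[Y \gtrless y]\big] = E[X_j \mid Y \gtrless y]. \tag{28}$$

It is clear that the homogeneity result is obeyed: $\sum_j a_j\, \partial S^\pm[Y]/\partial a_j = S^\pm[Y]$.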
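
The conditional-expectation form of the shortfall delta and the homogeneity property can be checked exactly on a small portfolio. The sketch below enumerates all default patterns of an independent default/no-default portfolio and computes the upper-tail quantities $P[Y > y]$, $E[Y \mid Y > y]$ and $E[X_j \mid Y > y]$, with $X_j$ the default indicator. The function name is hypothetical, and enumeration is only practical for a handful of names.

```python
from itertools import product


def exact_shortfall_deltas(y, exposures, pds):
    """Return (P[Y > y], E[Y | Y > y], [E[X_j | Y > y] for each j]) for
    independent default indicators X_j, by enumerating all patterns."""
    n = len(exposures)
    tail = 0.0
    loss_sum = 0.0      # accumulates E[Y * 1{Y > y}]
    joint = [0.0] * n   # accumulates E[X_j * 1{Y > y}]
    for pattern in product((0, 1), repeat=n):
        loss = sum(a * x for a, x in zip(exposures, pattern))
        if loss <= y:
            continue
        weight = 1.0
        for x, p in zip(pattern, pds):
            weight *= p if x else 1.0 - p
        tail += weight
        loss_sum += weight * loss
        for j, x in enumerate(pattern):
            if x:
                joint[j] += weight
    if tail == 0.0:
        raise ValueError("loss level y is at or above the maximum loss")
    return tail, loss_sum / tail, [g / tail for g in joint]
```

Summing $a_j E[X_j \mid Y > y]$ over the assets reproduces $E[Y \mid Y > y]$ exactly, and, unlike the exact VaR delta, every asset has a strictly positive shortfall delta whenever it can default in the tail.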

We said when discussing the problems of VaR contribution for discrete 
distributions that ESF is smoother than VaR, and in fact it is one order 
of differentiability smoother. This means that the first derivative of ESF 
is a continuous function of asset allocation, but, as we shall see later, the 
second derivative of ESF is as 'bad' as first derivative of VaR (so it is dis- 
continuous). So we still want to work everything through for the analytical 
approximations, which is what we do next. 

The saddlepoint analogue of the equations is 



dS ± [Y] ±1 



da i 



2vriP ± 



E 



9K Y\V c K v]v (s)-sy ^£ 



I± 



da 



(29) 



(and the integrand now has only a simple pole at the origin). Following 
[20] we split out a singular part that can be integrated exactly and leave a 
regular part to be saddlepoint-approximated. The result then followj^l: 

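$$
\begin{aligned}
\frac{\partial S^\pm[Y]}{\partial a_j}
&= \pm \frac{1}{2\pi i P^\pm}\, \mathbb{E}\!\left[ \int_{\mathcal{I}^\pm} \mu_{X_j|V}\, e^{K_{Y|V}(s) - sy}\, \frac{ds}{s} + \int_{\mathcal{I}^\pm} \frac{1}{s} \left( \frac{1}{s} \frac{\partial K_{Y|V}}{\partial a_j} - \mu_{X_j|V} \right) e^{K_{Y|V}(s) - sy}\, ds \right] \\
&\sim \frac{1}{P^\pm}\, \mathbb{E}\!\left[ \mu_{X_j|V}\, \mathbb{P}[Y \gtrless y \mid V] \pm \frac{1}{\hat{s}_V} \left( \frac{1}{\hat{s}_V} \frac{\partial K_{Y|V}}{\partial a_j} - \mu_{X_j|V} \right) f_{Y|V}(y) \right]
\end{aligned}
\tag{30}
$$

To obtain the conditional-Normal result, we can either specialise (30) to a
Normal conditional distribution or derive from first principles. As

$$
S^\pm[Y] = \frac{1}{P^\pm}\, \mathbb{E}\!\left[ \mu_{Y|V}\, \Phi(\mp z_V) \pm \sigma_{Y|V}\, \phi(z_V) \right] \tag{31}
$$

we have$^{11}$

$$
\frac{\partial S^\pm[Y]}{\partial a_j} = \frac{1}{P^\pm}\, \mathbb{E}\!\left[ \frac{\partial \mu_{Y|V}}{\partial a_j}\, \Phi(\mp z_V) \pm \frac{\partial \sigma_{Y|V}}{\partial a_j}\, \phi(z_V) \right] \tag{32}
$$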



10 $\mu_{X_j|V}$ denotes the conditional expectation of the $j$th asset.

11 It may appear that some of the terms have gone missing, on account of $z_V$ depending
on the asset allocations; however, these terms cancel. A nice thing about working with
ESF is that much of the algebra does come out cleanly.

As we pointed out earlier in the context of the shortfall at portfolio level
(and in [20, 22]), shortfall expressions naturally fall into two parts, the first
being systematic risk and the second unsystematic. We can see that the
same holds true for the deltas, in (32) and (30). The systematic part can
be written

$$
\mathbb{E}\big[\, \mathbb{E}[X_j \mid V] \;\big|\; Y \gtrless y \,\big];
$$

expressed in ungainly fashion, it is the expectation of the conditional-on- 
the-risk-factor-expectation of an asset, conditionally on the portfolio losing 
more than the VaR. The discussion surrounding Figure 15 later will show this
in practice: correlated assets or events contribute strongly to the first part, 
which roughly speaking is proportional to the asset's 'beta', and uncorrelated 
ones mainly to the second. 
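
The split of the conditional-Normal delta (32) into its two parts can be sketched numerically for the upper tail. The toy model below is an assumption made for the example: $X_j \mid V \sim N(\beta_j V, \sigma_j^2)$, independent given $V \sim N(0,1)$, so that $Y$ is itself Normal and both parts have closed forms against which the integral over $V$ can be checked. The function names, the trapezoidal grid and the parameter values are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard Normal: pdf, cdf, inv_cdf

def delta_parts(j, a, beta, sig, tail_prob, n_grid=2000, v_max=8.0):
    """Systematic and unsystematic parts of (32) for S+, integrating over V."""
    b_sum = sum(ai * bi for ai, bi in zip(a, beta))               # mu_{Y|V} = b_sum * V
    s_cond = sqrt(sum((ai * si) ** 2 for ai, si in zip(a, sig)))  # sigma_{Y|V}
    s_tot = sqrt(b_sum ** 2 + s_cond ** 2)                        # st.dev. of Y
    y = s_tot * STD.inv_cdf(1.0 - tail_prob)                      # VaR (Y is Normal here)
    dsig = a[j] * sig[j] ** 2 / s_cond                            # d sigma_{Y|V} / d a_j
    h = 2.0 * v_max / n_grid
    systematic = unsystematic = 0.0
    for i in range(n_grid + 1):
        v = -v_max + i * h
        w = h * STD.pdf(v) * (0.5 if i in (0, n_grid) else 1.0)   # trapezoid weight
        z = (y - b_sum * v) / s_cond                              # z_V
        systematic += w * beta[j] * v * STD.cdf(-z)               # d mu/d a_j * Phi(-z_V)
        unsystematic += w * dsig * STD.pdf(z)                     # d sigma/d a_j * phi(z_V)
    return systematic / tail_prob, unsystematic / tail_prob

def exact_parts(j, a, beta, sig, tail_prob):
    """Closed forms for the same toy model (Y is Normal with st.dev. s_tot)."""
    b_sum = sum(ai * bi for ai, bi in zip(a, beta))
    s_tot = sqrt(b_sum ** 2 + sum((ai * si) ** 2 for ai, si in zip(a, sig)))
    c = STD.pdf(STD.inv_cdf(1.0 - tail_prob)) / tail_prob
    return c * b_sum * beta[j] / s_tot, c * a[j] * sig[j] ** 2 / s_tot

a = [1.0, 2.0, 0.5]
beta = [0.3, 0.5, 0.7]
sig = [sqrt(1.0 - b * b) for b in beta]
numeric = [delta_parts(j, a, beta, sig, 0.01) for j in range(3)]
exact = [exact_parts(j, a, beta, sig, 0.01) for j in range(3)]
print(numeric)
```

In this toy model the systematic part scales with $\beta_j$ and the unsystematic part with $a_j \sigma_j^2$, which is the qualitative behaviour described above.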

Note also that the contribution to unsystematic risk must be positive, for 
both approximations, and for the exact result too, because conditionally on 
V the assets are independent and so any increase in allocation to an asset 
must cause the conditional variance (= unsystematic risk) to increase. This 
is exactly what we would like to conclude (see [22] for a fuller discussion). 
VaR does not have this property, which means that one could in principle 
be in the awkward position of increasing the exposure to an uncorrelated 
asset, keeping the others fixed, and watching the VaR decrease. 
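
A minimal sketch of the saddlepoint delta (30) for the upper tail is given below, in the degenerate case of no risk factor (so the outer expectation over $V$ is trivial). The Gamma-distributed assets, the use of the Lugannani-Rice formula for the tail probability, and all names and parameter values are assumptions made for the example. It checks two properties stated in the text: the deltas satisfy the homogeneity relation, and the unsystematic part is positive for every asset.

```python
from math import exp, log, pi, sqrt
from statistics import NormalDist

def saddlepoint_esf_delta(y, a, k):
    """Saddlepoint S+[Y] and deltas in the form of (30), with no risk factor.

    Assumed toy model: Y = sum_j a_j X_j, independent X_j ~ Gamma(k_j, 1),
    so K(s) = -sum_j k_j log(1 - a_j s) for s < 1 / max(a). Requires y != E[Y].
    """
    def k1(s):  # K'(s)
        return sum(kj * aj / (1.0 - aj * s) for aj, kj in zip(a, k))

    lo, hi = -100.0, 1.0 / max(a) - 1e-12
    for _ in range(200):  # bisection for K'(s_hat) = y (K' is increasing)
        mid = 0.5 * (lo + hi)
        if k1(mid) < y:
            lo = mid
        else:
            hi = mid
    s_hat = 0.5 * (lo + hi)

    k0 = -sum(kj * log(1.0 - aj * s_hat) for aj, kj in zip(a, k))          # K(s_hat)
    k2 = sum(kj * (aj / (1.0 - aj * s_hat)) ** 2 for aj, kj in zip(a, k))  # K''(s_hat)
    density = exp(k0 - s_hat * y) / sqrt(2.0 * pi * k2)                    # saddlepoint density
    r = sqrt(2.0 * (s_hat * y - k0)) * (1.0 if s_hat > 0 else -1.0)
    u = s_hat * sqrt(k2)
    std = NormalDist()
    tail = 1.0 - std.cdf(r) + std.pdf(r) * (1.0 / u - 1.0 / r)             # Lugannani-Rice

    mean_y = sum(aj * kj for aj, kj in zip(a, k))
    shortfall = (mean_y * tail + (y - mean_y) / s_hat * density) / tail
    # Unsystematic part: (1/s)((1/s) dK/da_j - mu_j) f / P, with (1/s) dK/da_j = k_j/(1 - a_j s).
    unsys = [(kj / (1.0 - aj * s_hat) - kj) / s_hat * density / tail
             for aj, kj in zip(a, k)]
    deltas = [kj + uj for kj, uj in zip(k, unsys)]   # mu_j = k_j is the first part
    return shortfall, deltas, unsys

a = [1.0, 2.0, 0.5]   # allocations (illustrative)
k = [2.0, 1.0, 4.0]   # Gamma shape parameters (illustrative)
y_level = 15.0        # loss level, above the mean of 6
shortfall, deltas, unsys = saddlepoint_esf_delta(y_level, a, k)
euler_sum = sum(aj * dj for aj, dj in zip(a, deltas))
print(shortfall, deltas)
```

The tail probability here is taken from the same approximation rather than fixed in advance, so the sketch says nothing about accuracy; it only shows that the structure of (30) respects homogeneity.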

Second derivative 

One of the main things that we shall prove here is that the saddlepoint ap- 
proximation preserves the convexity of the shortfall. The convexity arises in 
the first place because the Hessian matrix (second derivative) of the shortfall 
can be written as a conditional covariance, so we show that first. 
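Differentiating (27) again gives

$$
\frac{\partial^2 S^\pm[Y]}{\partial a_j \partial a_k} = \mp \frac{1}{2\pi P^\pm} \int_{\mathcal{R}^\pm} \frac{\partial^2 C_Y}{\partial a_j \partial a_k}\, e^{-i\omega y}\, \frac{d\omega}{\omega^2} \pm \frac{i}{2\pi P^\pm} \frac{\partial y}{\partial a_k} \int_{\mathcal{R}^\pm} \frac{\partial C_Y}{\partial a_j}\, e^{-i\omega y}\, \frac{d\omega}{\omega}
$$

and the second integral is already known to us because we dealt with it
when doing the VaR delta. As

$$
\frac{\partial^2 C_Y}{\partial a_j \partial a_k} = -\omega^2\, \mathbb{E}[X_j X_k e^{i\omega Y}],
$$

we can tidy things up to arrive at

$$
\begin{aligned}
\frac{\partial^2 S^\pm[Y]}{\partial a_j \partial a_k}
&= \pm \frac{f_Y(y)}{P^\pm} \Big( \mathbb{E}[X_j X_k \mid Y = y] - \mathbb{E}[X_j \mid Y = y]\, \mathbb{E}[X_k \mid Y = y] \Big) \\
&= \pm \frac{f_Y(y)}{P^\pm}\, \mathbb{V}[X_j, X_k \mid Y = y]
\end{aligned}
\tag{33}
$$

where $\mathbb{V}[\cdot,\cdot]$ denotes covariance. Hence the second derivative of ESF is (up to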
a factor) the conditional covariance of the pair of assets in question, con- 
ditionally on the portfolio value being equal to the VaR. As any covariance 
matrix is positive semidefinite the ESF is a convex function of the asset 
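allocations. It is also clear that

$$
\sum_{j,k} a_j a_k \frac{\partial^2 S^\pm[Y]}{\partial a_j \partial a_k} = 0 \tag{34}
$$

as the LHS is, up to a factor,

$$
\sum_{j,k} a_j a_k\, \mathbb{V}[X_j, X_k \mid Y = y] = \mathbb{V}[Y, Y \mid Y = y] = 0.
$$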

This is as expected: if the risk measure is 1-homogeneous then scaling all the
asset allocations by some factor causes the risk to increase linearly (so the
second derivative is zero) and so the vector of asset allocations must be a null
eigenvector of the Hessian matrix. Incidentally, because the conditioning is
on a particular portfolio value or loss, the estimation of the second derivative
of ESF is as troublesome as that of the first derivative of VaR for discrete
portfolio models or Monte Carlo.

We now turn to the saddlepoint approximation, which as we shall see 
contains a trap for the unwary. The second derivative of ESF has (after 
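using (23) to tidy it up) the following integral representation:

$$
\frac{\partial^2 S^\pm[Y]}{\partial a_j \partial a_k} = \pm \frac{1}{P^\pm} \left\{ \mathbb{E}\!\left[ \frac{1}{2\pi i} \int_{\mathcal{I}^\pm} \left( \frac{\partial^2 K_{Y|V}}{\partial a_j \partial a_k} + \frac{\partial K_{Y|V}}{\partial a_j} \frac{\partial K_{Y|V}}{\partial a_k} \right) e^{K_{Y|V}(s) - sy}\, \frac{ds}{s^2} \right] - \frac{\partial y}{\partial a_j} \frac{\partial y}{\partial a_k}\, f_Y(y) \right\} \tag{35}
$$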



As the integrand is regular at the origin, it looks obvious to approximate
the integral as

$$
\left( \frac{1}{s^2} \frac{\partial^2 K_{Y|V}}{\partial a_j \partial a_k} + \frac{1}{s} \frac{\partial K_{Y|V}}{\partial a_j}\, \frac{1}{s} \frac{\partial K_{Y|V}}{\partial a_k} \right)_{s = \hat{s}_V} f_{Y|V}(y).
$$

This expression is incorrect and if it is used, one ends up violating (34) and
producing expressions for the conditional asset covariances that do not make
sense. [From now on, we abbreviate $K_{Y|V}(s)$ to $K$.]

The problem is that we have been careless about the size of the terms
neglected in the asymptotic expansion and ended up by omitting a term that is
of the same order as those shown above. In recalling that the saddlepoint
expansion is asymptotic as $K \to \infty$, the error becomes clear:
$(\partial K/\partial a_j)(\partial K/\partial a_k)$ is order $K^2$, and therefore needs a higher-order
treatment at the saddlepoint to make the neglected terms of consistent order.
Suffice it to say that the correct expression for the integral is$^{12}$

$$
\left( \frac{1}{s^2} \frac{\partial^2 K}{\partial a_j \partial a_k} + \frac{1}{s} \frac{\partial K}{\partial a_j}\, \frac{1}{s} \frac{\partial K}{\partial a_k} - \frac{1}{K''(s)} \left( \frac{\partial}{\partial s} \frac{1}{s} \frac{\partial K}{\partial a_j} \right) \left( \frac{\partial}{\partial s} \frac{1}{s} \frac{\partial K}{\partial a_k} \right) \right)_{s = \hat{s}_V} f_{Y|V}(y).
$$



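On collecting terms we arrive at an expression that naturally divides into
two parts:

$$
\frac{\partial^2 S^\pm[Y]}{\partial a_j \partial a_k} \sim \pm \frac{1}{P^\pm}\, \mathbb{E}\big[ (H^S_{jk} + H^U_{jk})\, f_{Y|V}(y) \big],
$$

or equivalently, by (33),

$$
\mathbb{V}[X_j, X_k \mid Y = y] \sim \frac{1}{f_Y(y)}\, \mathbb{E}\big[ (H^S_{jk} + H^U_{jk})\, f_{Y|V}(y) \big],
$$

with

$$
H^S_{jk} = \left( \frac{1}{\hat{s}_V} \frac{\partial K}{\partial a_j} - \frac{\partial y}{\partial a_j} \right) \left( \frac{1}{\hat{s}_V} \frac{\partial K}{\partial a_k} - \frac{\partial y}{\partial a_k} \right) \tag{36}
$$

$$
H^U_{jk} = \frac{1}{\hat{s}_V^2} \frac{\partial^2 K}{\partial a_j \partial a_k} - \frac{1}{K''(\hat{s}_V)} \left( \frac{\partial}{\partial s} \frac{1}{s} \frac{\partial K}{\partial a_j} \right)_{s = \hat{s}_V} \left( \frac{\partial}{\partial s} \frac{1}{s} \frac{\partial K}{\partial a_k} \right)_{s = \hat{s}_V}
$$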

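It is clear that $H^S$ is a positive semidefinite matrix (because it is of the form
$v_j v_k$); also $\sum_{j,k} a_j a_k H^S_{jk} = 0$.

By homogeneity properties of $K$ (it depends on $a_j$ and $s$ only through
their product),

$$
\frac{\partial^2 K}{\partial s^2} = \sum_j a_j \frac{\partial}{\partial s} \frac{1}{s} \frac{\partial K}{\partial a_j} = \sum_{j,k} a_j a_k \frac{1}{s^2} \frac{\partial^2 K}{\partial a_j \partial a_k}
$$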



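12 Actually a term is still missing from this as there are two terms of the form
$(\partial/\partial s)^2 (s^{-1} \partial K/\partial a_j) \times s^{-1} \partial K/\partial a_k$. However, omission of these does not violate (34),
and they also vanish for the Normal case.

so $\sum_{j,k} a_j a_k H^U_{jk} = 0$ too, and (34) is satisfied. The only thing to settle now
is whether $H^U$ is positive semidefinite, and this is basically the Cauchy-
Schwarz inequality. Indeed, defining the quadratic

$$
q(t) = \tilde{\mathbb{E}}\Big[ \big( (Y - \tilde{\mathbb{E}}[Y]) + t\, u \cdot (X - \tilde{\mathbb{E}}[X]) \big)^2 \Big]
$$

(we drop the $V$ suffix for convenience), where $X$ is the vector of asset values,
$u$ is some arbitrary vector and $\tilde{\mathbb{E}}$ denotes the tilted expectation, we find
the coefficients of $1, t, t^2$ in $q(t)$ to be:

$$
\begin{aligned}
1 &: \quad \tilde{\mathbb{E}}\big[ (Y - \tilde{\mathbb{E}}[Y])^2 \big] = \frac{\partial^2 K}{\partial s^2} \\
t &: \quad 2\, \tilde{\mathbb{E}}\big[ u \cdot (X - \tilde{\mathbb{E}}[X])\, (Y - \tilde{\mathbb{E}}[Y]) \big] = 2 \sum_j u_j \frac{\partial}{\partial s} \frac{1}{s} \frac{\partial K}{\partial a_j} \\
t^2 &: \quad \tilde{\mathbb{E}}\big[ \big( u \cdot (X - \tilde{\mathbb{E}}[X]) \big)^2 \big] = \sum_{j,k} \frac{u_j u_k}{s^2} \frac{\partial^2 K}{\partial a_j \partial a_k}
\end{aligned}
$$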



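As $q$ can never be negative its discriminant must be $\le 0$, so

$$
\frac{\partial^2 K}{\partial s^2} \cdot \sum_{j,k} \frac{u_j u_k}{s^2} \frac{\partial^2 K}{\partial a_j \partial a_k} \ge \left( \sum_j u_j \frac{\partial}{\partial s} \frac{1}{s} \frac{\partial K}{\partial a_j} \right)^2
$$

which amounts to $\sum_{j,k} u_j u_k H^U_{jk} \ge 0$, as required. We conclude that the
saddlepoint approximation has preserved the convexity.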
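
The two properties of $H^U$ just established can be checked numerically in a small case. The sketch below assumes a default/no-default model with defaults that are independent (conditionally on $V$), so that $K(s) = \sum_j \log(1 - p_j + p_j e^{a_j s})$ and the matrix $s^{-2}\, \partial^2 K/\partial a_j \partial a_k$ is diagonal; the probabilities, exposures, loss level and function names are illustrative assumptions, not values from the text.

```python
from math import exp
import random

def tilted_probs(s, a, p):
    """Default probabilities under exponential tilting by s."""
    return [pj * exp(aj * s) / (1.0 - pj + pj * exp(aj * s))
            for aj, pj in zip(a, p)]

def saddlepoint(y, a, p, lo=-50.0, hi=50.0):
    """Solve K'(s) = y by bisection; K'(s) = sum_j a_j q_j(s) is increasing."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        k1 = sum(aj * qj for aj, qj in zip(a, tilted_probs(mid, a, p)))
        if k1 < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def h_unsystematic(y, a, p):
    """H^U for independent Bernoulli assets, evaluated at the saddlepoint."""
    s_hat = saddlepoint(y, a, p)
    q = tilted_probs(s_hat, a, p)
    kappa = [qj * (1.0 - qj) for qj in q]                # (1/s^2) d2K/da_j^2
    k2 = sum(aj * aj * kj for aj, kj in zip(a, kappa))   # K''(s_hat)
    n = len(a)
    # d/ds of (1/s) dK/da_j equals a_j * kappa_j in this model.
    return [[(kappa[j] if j == k else 0.0)
             - a[j] * kappa[j] * a[k] * kappa[k] / k2
             for k in range(n)] for j in range(n)]

def quad_form(h, u):
    n = len(u)
    return sum(u[j] * h[j][k] * u[k] for j in range(n) for k in range(n))

a = [1.0, 2.0, 0.5, 3.0]       # exposures (illustrative)
p = [0.02, 0.05, 0.10, 0.01]   # conditional default probabilities (illustrative)
H = h_unsystematic(y=2.5, a=a, p=p)
rng = random.Random(1)
forms = [quad_form(H, [rng.uniform(-1.0, 1.0) for _ in a]) for _ in range(1000)]
print(quad_form(H, a), min(forms))
```

The quadratic form along the allocation vector vanishes, and for randomly drawn directions it is nonnegative up to rounding, as the Cauchy-Schwarz argument requires.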

In the conditional-Normal framework, we have a rather simpler expres- 
sion. In the same notation as before: 



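$$
\frac{\partial^2 S^\pm[Y]}{\partial a_j \partial a_k} = \pm \frac{1}{P^\pm}\, \mathbb{E}\!\left[ \left( \sigma_{Y|V}^2 \frac{\partial z_V}{\partial a_j} \frac{\partial z_V}{\partial a_k} + \sigma_{Y|V} \frac{\partial^2 \sigma_{Y|V}}{\partial a_j \partial a_k} \right) \sigma_{Y|V}^{-1}\, \phi(z_V) \right] \tag{37}
$$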



Predictably, we can identify the first term as a convexity contribution to
systematic risk and the second to the unsystematic risk. As we can expect
from the saddlepoint derivation, both terms are positive semidefinite
matrices: the first is of the form $v_j v_k$ with $v_j = \sigma_{Y|V}\, \partial z_V/\partial a_j$, while the
second is the Hessian of the standard deviation (a convex risk measure). As
the mixture weights $\sigma_{Y|V}^{-1} \phi(z_V)$ are nonnegative (they are the $V$-conditional
density of $Y$) the LHS must also be semidefinite.

Conditional covariance



There are some interesting issues surrounding the covariance of two assets 
conditional on the portfolio loss being some level which, as we have seen, 
is the second derivative of shortfall. We earlier found the approximation to 
the expectation of an asset conditionally on the portfolio value (again we
drop the $V$), as a tilted expectation:

$$
\mathbb{E}[X_j \mid Y = y] \sim \tilde{\mathbb{E}}[X_j], \qquad \tilde{\mathbb{E}}[Z] = \frac{\mathbb{E}[Z e^{sY}]}{\mathbb{E}[e^{sY}]}.
$$

It is superficially attractive, but incorrect, to extend this to higher-order 
expressions, thereby surmising 

$$
\mathbb{E}[X_j X_k \mid Y = y] \overset{??}{\sim} \tilde{\mathbb{E}}[X_j X_k]
$$

and (defining the tilted variance in the obvious way, i.e. as the tilted
expectation of the product minus the product of the tilted expectations)

$$
\mathbb{V}[X_j, X_k \mid Y = y] \overset{??}{\sim} \tilde{\mathbb{V}}[X_j, X_k].
$$

The problem with this is that if one multiplies by $a_j a_k$ and sums over both
suffices one ends up with

$$
\mathbb{V}[Y, Y \mid Y = y] \overset{??}{\sim} \tilde{\mathbb{V}}[Y, Y].
$$

The LHS must be zero, of course (it is the variance of something that is not
being allowed to vary), but the RHS is $\partial^2 K_Y/\partial s^2 \ne 0$. In other words, the
homogeneity relation is wrong.

The reason for the error is that terms have gone missing in the derivation,
and those terms are the ones alluded to in the saddlepoint derivation
above$^{13}$. The quadratic (covariance) expression should read

$$
\mathbb{V}[X_j, X_k \mid Y = y] \sim \tilde{\mathbb{V}}[X_j, X_k] - \frac{\tilde{\mathbb{V}}[X_j, Y]\, \tilde{\mathbb{V}}[X_k, Y]}{\tilde{\mathbb{V}}[Y, Y]}.
$$

In effect, the extra term corrects the homogeneity relation by subtracting
the unwanted convexity along the line of the asset allocations, exactly as in
the following method of killing eigenvalues in a symmetric matrix $\Omega$:

$$
\Omega^* = \Omega - \frac{\Omega a \otimes \Omega a}{a' \Omega a}.
$$

If $\Omega$ is positive semidefinite then so is $\Omega^*$ (by Cauchy-Schwarz); and as
$\Omega^* a = 0$, $\Omega^*$ has one more zero eigenvalue than $\Omega$.
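
The eigenvalue-killing construction is easy to try out. The sketch below builds a positive semidefinite $\Omega$ as $BB'$ from random numbers (an arbitrary choice made for the example, as are the vector $a$ and the function name) and checks that $\Omega^* a = 0$ and that $\Omega^*$ stays positive semidefinite along randomly drawn directions.

```python
import random

def kill_direction(omega, a):
    """Return omega* = omega - (omega a)(omega a)' / (a' omega a)."""
    n = len(a)
    oa = [sum(omega[i][k] * a[k] for k in range(n)) for i in range(n)]
    denom = sum(a[i] * oa[i] for i in range(n))   # a' omega a, assumed nonzero
    return [[omega[i][j] - oa[i] * oa[j] / denom for j in range(n)]
            for i in range(n)]

def quad_form(m, u):
    n = len(u)
    return sum(u[i] * m[i][j] * u[j] for i in range(n) for j in range(n))

rng = random.Random(7)
n = 4
b = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
# omega = B B' is symmetric positive semidefinite by construction.
omega = [[sum(b[i][k] * b[j][k] for k in range(n)) for j in range(n)]
         for i in range(n)]
a = [1.0, 2.0, 0.5, 3.0]
omega_star = kill_direction(omega, a)

null_image = [sum(omega_star[i][k] * a[k] for k in range(n)) for i in range(n)]
forms = [quad_form(omega_star, [rng.uniform(-1.0, 1.0) for _ in range(n)])
         for _ in range(1000)]
print(null_image, min(forms))
```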



13 The error occurs in [30], where higher-order generalisations were also stated without
proof. Correcting these seems to require a fair amount of reworking of their theory.

Figure 14: VaR and shortfall in 50-asset portfolio. For each, the dotted line shows
the approximation, and the solid one shows the result obtained by Monte Carlo.




[Figure 15 plot; horizontal axis: Systematic.]

Figure 15: For all the assets in the portfolio, this plot shows their shortfall con- 
tribution divided into systematic and unsystematic parts. Correlated assets sit on 
the bottom right, large single-name exposures on the top left. Assets in the bottom 
left contribute little risk. 



3.3 Examples 

We conclude with some examples of the theory in action. 

We revisit the earlier example of a default/no-default model under the
one-factor Gaussian copula, where now we mix up the $\beta$'s, so that some
assets have $\beta = 0.3$, some 0.5 and some 0.7. Figure 15 shows the systematic
and unsystematic ESF contributions (30) of all the assets, in a scatter graph.
The asset at the far top-left of the graph (coordinates (0.27, 0.56)) is the
largest exposure and is to a name with low default probability, i.e. a tail
risk; those on the bottom right have higher correlation and higher default
probability, though generally lower exposure. The interpretation of Figure 15
for the purposes of portfolio management is that the assets in the top left
need single-name hedges and those in the bottom-right can largely be dealt
with using baskets or bespoke index protection.

[Figure 16 plots; horizontal axis of each panel: Allocation to asset.]

Figure 16: [Left] Shortfall (solid line) vs quadratic approximation (dotted), as
asset allocation is varied. The approximation requires only the shortfall, delta and
gamma in the 'base case', whereas the exact computation requires a full calculation
at each point. [Right] Again for varying that particular asset allocation, this shows
the delta of the asset and the systematic delta. The systematic delta remains
roughly constant so the inference is that more unsystematic risk is being added to
the portfolio.

Next we consider what happens when one particular exposure is varied.
Figure 16 shows the performance surface (risk vs allocation), and also
shows a quadratic approximation based on the first and second derivatives
estimated using (30, 36). The importance of this is that the derivatives can
be easily calculated along with the risk in one go, after which the quadratic 
approximant is easily drawn, whereas tracing the exact line requires the 
portfolio to be re-analysed for each desired asset allocation. 
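
The idea of a quadratic approximant built from the base-case shortfall, delta and gamma can be sketched in a setting where everything is available in closed form. The example below assumes a zero-mean multivariate Normal portfolio, for which $S^+ = c\, \sqrt{a' \Sigma a}$ with $c = \phi(z_p)/P^+$; this is a stand-in chosen for the illustration and is not the credit model behind Figure 16. The covariance matrix, allocations and function names are likewise assumptions.

```python
from math import sqrt
from statistics import NormalDist

def gaussian_esf(a, cov, tail_prob):
    """Shortfall, delta vector and gamma matrix for a zero-mean Normal portfolio."""
    n = len(a)
    std = NormalDist()
    c = std.pdf(std.inv_cdf(1.0 - tail_prob)) / tail_prob
    ca = [sum(cov[i][k] * a[k] for k in range(n)) for i in range(n)]   # Sigma a
    sd = sqrt(sum(a[i] * ca[i] for i in range(n)))                     # sqrt(a' Sigma a)
    delta = [c * ca[i] / sd for i in range(n)]
    gamma = [[c * (cov[i][j] / sd - ca[i] * ca[j] / sd ** 3) for j in range(n)]
             for i in range(n)]
    return c * sd, delta, gamma

cov = [[1.0, 0.3, 0.1],
       [0.3, 2.0, 0.4],
       [0.1, 0.4, 0.5]]
base = [1.0, 1.0, 1.0]
tail_prob = 0.01
s0, delta, gamma = gaussian_esf(base, cov, tail_prob)

def quadratic(bump, j=0):
    """Second-order approximant using only base-case quantities."""
    return s0 + delta[j] * bump + 0.5 * gamma[j][j] * bump ** 2

def exact(bump, j=0):
    """Full recalculation with allocation j bumped."""
    a = list(base)
    a[j] += bump
    return gaussian_esf(a, cov, tail_prob)[0]

for bump in (-0.2, -0.1, 0.1, 0.2):
    print(bump, exact(bump), quadratic(bump))

# The allocation vector is a null vector of the Hessian (1-homogeneity).
gamma_a = [sum(gamma[i][k] * base[k] for k in range(3)) for i in range(3)]
```

The approximant needs one base-case calculation, whereas `exact` recomputes the risk for every bump, which mirrors the trade-off described above.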

4 Conclusions 

We have given a reasonably complete overview of the theory of risk aggre- 
gation and disaggregation of distributions that are not necessarily Normal. 
The results from Fourier transformation are completely general; the
saddlepoint ones assume the existence of the moment-generating function, and
are thereby suited to semi-heavy-tailed distributions. This assumption is
sufficiently restrictive to allow useful and interesting results to be derived, 
particularly in the subject of risk contributions; and it is sufficiently gen- 
eral to allow application to real-world problems. The subject has developed 
over the last ten years and is now gaining general acceptance. In general 
we prefer to operate with shortfall rather than VaR and routinely use the 
formulae here for the risk contributions and the like. 

It is worth emphasising however that, no matter how clever the tech- 
niques of calculation, the results are only as good as the model assumptions 
used to provide the input. In the 'credit crunch' these have generally been 
found to be woefully inadequate. However, the methods described here to 
perform fast aggregation and disaggregation of risk are an important part 
of the backbone of credit risk management systems and are to be recom- 
mended. 

Summary of results 

To help the reader navigate the formulas, the following summarises where
the results can be found. First, the basic results for the risk measures and
related quantities:

Secondly, for the derivatives: